Traditional code review tools — ESLint, SonarQube, CodeClimate — are excellent at catching known patterns: unused variables, style violations, common security anti-patterns. What they can't do is reason about your specific architecture, trace logic flow across 30 files, or understand that this database query is dangerous in the context of how the caller is built.
DeepSeek V4's 1 million-token context makes a different kind of analysis possible: load the entire codebase, ask a question, get a structured answer that understands the whole system. This tutorial shows you exactly how to build it.
Before writing code, it helps to understand what the model is actually doing differently from a linter:
- A query in `utils/db.py` that's only exploitable because of how `api/users.py` calls it
- A `PaymentProcessor.refund()` method that has no tests, but is called from 4 different places

These are things a human senior engineer would catch in a thorough review. With DeepSeek V4, you can run this analysis in 90 seconds.
The API takes text, not files. You need to serialize your codebase into a flat text representation. The simplest approach is to zip your source directory and extract relevant files:
```python
import zipfile
import io

TEXT_EXTENSIONS = {
    ".py", ".js", ".ts", ".tsx", ".jsx", ".go", ".rs",
    ".java", ".c", ".h", ".cpp", ".hpp", ".cs", ".rb",
    ".php", ".swift", ".kt", ".sh", ".yml", ".yaml",
    ".json", ".toml", ".md", ".sql", ".html", ".css",
}

def extract_repo(zip_bytes: bytes, max_file_bytes: int = 500_000) -> str:
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        for info in z.infolist():
            if info.is_dir() or info.file_size > max_file_bytes:
                continue
            ext = ("." + info.filename.rsplit(".", 1)[-1].lower()) if "." in info.filename else ""
            if ext not in TEXT_EXTENSIONS:
                continue
            # Skip common non-source directories
            if any(d in info.filename for d in ["node_modules/", ".venv/", "__pycache__/", ".git/"]):
                continue
            try:
                text = z.read(info).decode("utf-8", errors="ignore")
                parts.append(f"=== {info.filename} ===\n{text}")
            except Exception:
                pass
    return "\n\n".join(parts)
```
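If your repository isn't already a zip archive, you can build one in memory before calling `extract_repo`. A minimal sketch; the `zip_directory` helper and the `./my-project` path are illustrative, not part of the tutorial's API:

```python
import io
import pathlib
import zipfile

def zip_directory(root: str) -> bytes:
    """Zip a source tree in memory so it can be passed to extract_repo()."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        for path in pathlib.Path(root).rglob("*"):
            if path.is_file():
                z.write(path, arcname=path.relative_to(root))
    return buf.getvalue()

zip_bytes = zip_directory("./my-project")
repo_text = extract_repo(zip_bytes)
```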
The system prompt is what separates a generic analysis from a useful one. For code review, you want structured output with actionable sections:
```python
CODE_REVIEW_PROMPT = """You are a senior staff engineer conducting a thorough code review.

Analyze the provided repository for the following categories:

**Critical Issues** — Security vulnerabilities, data loss risks, authentication bypasses, SQL injection,
XSS, SSRF, insecure deserialization, hard-coded credentials, exposed secrets.

**Design Issues** — God classes, excessive coupling, circular dependencies, missing abstractions,
inconsistency between stated architecture and actual structure.

**Performance Issues** — N+1 queries, missing indexes, synchronous blocking in async context,
unbounded loops or memory allocations.

**Test Coverage Gaps** — Critical paths with no tests, mocked tests that can't catch real bugs,
missing edge case coverage.

**Quick Wins** — Low-effort improvements with meaningful impact.

For each issue, cite the specific file(s) and approximate line range.
Output well-formatted Markdown. Be direct and precise. Do not summarize what is already obvious."""
```
For large inputs, streaming is essential — a 500k-token analysis can take 3–5 minutes to complete, and streaming shows progress rather than a spinning indicator:
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-deepseek-api-key",
    base_url="https://api.deepseek.com/v1",
)

def review_repo(repo_text: str) -> str:
    stream = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": CODE_REVIEW_PROMPT},
            {"role": "user", "content": f"Repository to review:\n\n{repo_text}"},
        ],
        temperature=0.2,  # lower = more deterministic, better for analysis
        stream=True,
    )
    result = ""
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        result += delta
        print(delta, end="", flush=True)  # real-time streaming
    return result
```
DeepSeek V4 charges $0.27 per million input tokens and $1.10 per million output tokens. As a rough estimate, the 500k-token repository mentioned above works out to about $0.14 of input per review, with output cost depending on the length of the generated report.
You can pre-screen the token count before calling the API to avoid surprises:
```python
def estimate_tokens(text: str) -> int:
    # ~4 characters per token is a reliable approximation for English/code
    return len(text) // 4

repo_text = extract_repo(zip_bytes)
estimated_tokens = estimate_tokens(repo_text)
estimated_cost = (estimated_tokens / 1_000_000) * 0.27
print(f"Estimated: {estimated_tokens:,} tokens, ${estimated_cost:.3f} input cost")
```
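If you want a rough total up front, fold in an assumed report length; the 5,000-token figure below is an assumption to adjust based on your own runs:

```python
ASSUMED_OUTPUT_TOKENS = 5_000  # assumption: a typical review report length, tune to your repos

output_cost = (ASSUMED_OUTPUT_TOKENS / 1_000_000) * 1.10
print(f"Estimated total: ~${estimated_cost + output_cost:.3f} (input + output)")
```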
Everything above applies equally to document analysis. For contract review:
```python
CONTRACT_REVIEW_PROMPT = """You are a paralegal AI reviewing legal contracts.

For each document, extract and summarize:

**Parties** — Full legal names and roles
**Term** — Start date, end date, renewal conditions
**Key Obligations** — What each party must do, by when
**Payment Terms** — Amounts, schedule, late payment penalties
**Termination** — Conditions, notice requirements, consequences
**Liability & Indemnification** — Caps, exclusions, indemnity scope
**Red Flags** — Non-standard clauses, one-sided terms, missing protections
**Action Items** — What needs to happen before this can be signed

Output structured Markdown. Be precise and cite section numbers."""
```
```python
def review_documents(pdf_texts: list[tuple[str, str]]) -> str:
    combined = "\n\n".join(
        f"## Document: {name}\n\n{text}"
        for name, text in pdf_texts
    )
    stream = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": CONTRACT_REVIEW_PROMPT},
            {"role": "user", "content": combined},
        ],
        temperature=0.1,
        stream=True,
    )
    # ... same streaming loop as review_repo
```
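`review_documents` expects plain text, so the PDFs need a text-extraction pass first. One way to build the `(name, text)` pairs, sketched here with the pypdf library (the file names are placeholders):

```python
from pypdf import PdfReader

def load_pdfs(paths: list[str]) -> list[tuple[str, str]]:
    """Extract plain text from each PDF, keeping the file name for citations."""
    docs = []
    for path in paths:
        reader = PdfReader(path)
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
        docs.append((path, text))
    return docs

review_documents(load_pdfs(["msa.pdf", "statement_of_work.pdf"]))
```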
Large context analyses can take several minutes. Set generous HTTP timeouts — 600 seconds for the read timeout is reasonable. If you're building a web app, use streaming to keep the connection alive and show progress.
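With the OpenAI Python SDK, the timeout is a client-level setting; a minimal sketch using the 600-second figure above:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-deepseek-api-key",
    base_url="https://api.deepseek.com/v1",
    timeout=600.0,  # seconds; long-context analyses can run for several minutes
)
```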
DeepSeek's API can return 429 (rate limit) or 503 (overload) under high load. Implement exponential backoff with jitter for retries. Don't retry immediately on 413 (input too large) — that requires reducing the input.
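A minimal retry wrapper along those lines, using the OpenAI SDK's exception types (the delay values are illustrative):

```python
import random
import time

import openai

def call_with_retries(fn, max_attempts: int = 5):
    """Retry on 429/503 with exponential backoff plus jitter; never retry 413."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except openai.APIStatusError as e:
            retryable = e.status_code in (429, 503)
            if not retryable or attempt == max_attempts - 1:
                raise  # includes 413: shrink the input instead of retrying
            time.sleep((2 ** attempt) + random.uniform(0, 1))

report = call_with_retries(lambda: review_repo(repo_text))
```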
If you're running code reviews on a repo that hasn't changed between runs, cache the output keyed by the repo's git SHA. DeepSeek V4's analysis of a given codebase is deterministic enough (at temperature=0.2) that caching is valid.
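A sketch of that cache, keyed by the current commit (the `.review_cache` directory name is arbitrary):

```python
import pathlib
import subprocess

CACHE_DIR = pathlib.Path(".review_cache")

def cached_review(repo_path: str, repo_text: str) -> str:
    """Reuse a stored review for the current commit; otherwise run and store one."""
    sha = subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo_path, text=True
    ).strip()
    CACHE_DIR.mkdir(exist_ok=True)
    cache_file = CACHE_DIR / f"{sha}.md"
    if cache_file.exists():
        return cache_file.read_text()
    report = review_repo(repo_text)
    cache_file.write_text(report)
    return report
```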
Don't want to build this yourself? Agent Workbench handles extraction, streaming, credits, and structured output — no code required.
LLM-based code review is not a replacement for static analysis tools. It's a complement. ESLint catches syntax errors and style issues faster and cheaper. SonarQube has curated rule sets for common vulnerability patterns validated against thousands of CVEs. DeepSeek V4 adds the layer that static tools can't: contextual reasoning about your specific architecture, business logic, and cross-file design.
The best setup is both: automated static analysis in CI (free, fast, no tokens) plus periodic LLM review for the architecture-level issues that linters miss.