Context Assembly

Retrieving code is easy. Assembling it into a coherent prompt that fits the context window while maximizing information density is hard. The pipeline does this in five steps (retrieve, score, budget, compress, format), each tunable and all running locally on your machine.

Why Assembly Matters

Naive RAG dumps the top-K chunks into the prompt and hopes the model can separate signal from noise. SourcePrep treats context as a budget to be allocated: every chunk is scored, weighted by intent, compressed according to its relevance, and formatted with citations the AI can navigate.

The Assembly Pipeline

Five steps run in sequence. Each step has knobs in the dashboard's Context Assembler panel.

RETRIEVAL

Candidates are gathered from multiple sources: semantic search (top-K vector similarity), keyword search (BM25), and the Code Graph (related definitions and call sites, when trace expansion is on).
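The source does not say how the candidate lists are merged, so the following is only a sketch of one common approach, reciprocal rank fusion, swapped in for illustration. The function name merge_candidates, the chunk IDs, and the constant k=60 are assumptions, not SourcePrep internals.

```python
from collections import defaultdict

def merge_candidates(ranked_lists, k=60):
    """Fuse ranked lists of chunk IDs with reciprocal rank fusion (RRF).

    ranked_lists maps a source name to chunk IDs ordered best-first.
    Returns (chunk_id, fused_score) pairs, best first.
    NOTE: RRF is an illustrative choice, not a documented SourcePrep method.
    """
    fused = defaultdict(float)
    for chunk_ids in ranked_lists.values():
        for rank, chunk_id in enumerate(chunk_ids, start=1):
            # A chunk found by several sources accumulates score from each.
            fused[chunk_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by chunk ID for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical candidate lists from the three retrieval sources.
candidates = merge_candidates({
    "semantic": ["auth.py:login", "auth.py:logout", "db.py:connect"],
    "keyword": ["auth.py:login", "README.md#auth"],
    "code_graph": ["session.py:create"],
})
```

A chunk returned by both semantic and keyword search ends up ahead of chunks that only one source found, which is the behavior you generally want before the scoring step.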

SCORING

Each candidate is re-scored on relevance (vector distance), the kind of question being asked (auto-classified; see the MCP reference for the full intent taxonomy), file-type weights (docs vs code vs tests), path weights (per-directory multipliers), priming (AGENTS.md and primer files get a global boost), and recency.
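As a rough illustration of how such factors might combine, the sketch below multiplies a base relevance by a file-type weight, a path weight, and a primer boost. The multiplicative form, every weight value, and all names (Chunk, FILE_TYPE_WEIGHTS, PATH_WEIGHTS, PRIMER_BOOST) are assumptions; intent and recency are left out to keep it short.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    path: str
    kind: str                # "code", "docs", or "tests"
    relevance: float         # 0.0-1.0, derived from vector distance
    is_primer: bool = False  # e.g. AGENTS.md or another primer file

# Hypothetical weights; real values would come from the dashboard settings.
FILE_TYPE_WEIGHTS = {"code": 1.0, "docs": 0.8, "tests": 0.6}
PATH_WEIGHTS = {"src/": 1.2, "vendor/": 0.3}
PRIMER_BOOST = 1.5

def path_weight(path):
    """Return the multiplier of the first matching directory prefix."""
    for prefix, weight in PATH_WEIGHTS.items():
        if path.startswith(prefix):
            return weight
    return 1.0

def final_score(chunk):
    """Combine the factors multiplicatively (an assumed formula)."""
    score = chunk.relevance
    score *= FILE_TYPE_WEIGHTS.get(chunk.kind, 1.0)
    score *= path_weight(chunk.path)
    if chunk.is_primer:
        score *= PRIMER_BOOST
    return score
```

With these made-up weights, a moderately relevant chunk under src/ can outrank a highly relevant test file under vendor/, which is the kind of trade-off the per-directory multipliers exist to express.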

BUDGETING

You specify a max_chars or max_tokens budget. Chunks are sorted by final score and added greedily until the budget is nearly full. Class headers and function signatures are preserved as "glue" so that each chunk keeps its surrounding context.
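The greedy fill can be sketched in a few lines. This is a minimal version under stated assumptions: fill_budget is a hypothetical name, the budget is counted in characters, a chunk that does not fit is skipped so a smaller one can still be tried, and the "glue" headers are not modeled.

```python
def fill_budget(chunks, max_chars):
    """Greedily select chunk texts by score until the budget is used up.

    chunks is a list of (score, text) pairs. Returns (selected_texts, used_chars).
    """
    selected = []
    used = 0
    for _score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        if used + len(text) > max_chars:
            # Assumption: skip an oversized chunk rather than stop outright,
            # so a smaller, lower-scored chunk can still fill the remainder.
            continue
        selected.append(text)
        used += len(text)
    return selected, used

# Hypothetical chunks of 50, 60, and 30 characters against a 100-char budget.
picked, used = fill_budget(
    [(0.9, "a" * 50), (0.8, "b" * 60), (0.5, "c" * 30)],
    max_chars=100,
)
```

Here the second-best chunk is dropped because it would overflow the budget, and the smaller third chunk takes the remaining room.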

COMPRESSION

Two engines, both CPU-only. Code is structurally compressed at a Level of Detail (LOD) determined by relevance: top results stay full, mid-relevance results show signatures, and peripheral files show names only (3–20× smaller, with no model involved). Documentation is compressed with a lightweight language model that strips filler while preserving meaning (roughly 2.4× smaller).
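To make the three levels of detail concrete, here is a small sketch that picks an LOD from a score and renders Python source at that level with the standard ast module. The thresholds (0.7 and 0.4), the function names, and the restriction to top-level Python definitions are all assumptions; the real engine is not described at this level in the source.

```python
import ast

def choose_lod(score, full_at=0.7, signatures_at=0.4):
    """Map a relevance score to a level of detail (thresholds are hypothetical)."""
    if score >= full_at:
        return "full"
    if score >= signatures_at:
        return "signatures"
    return "names"

def compress(source, lod):
    """Render top-level functions and classes of Python source at the given LOD."""
    if lod == "full":
        return source
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            if lod == "signatures":
                # ast.unparse (Python 3.9+) turns the arguments node back into text.
                lines.append(f"def {node.name}({ast.unparse(node.args)}): ...")
            else:
                lines.append(node.name)
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ..." if lod == "signatures" else node.name)
    return "\n".join(lines)

SAMPLE = """def login(user, password):
    token = sign(user)
    return token

class Session:
    pass
"""

signatures_view = compress(SAMPLE, choose_lod(0.5))
names_view = compress(SAMPLE, choose_lod(0.1))
```

Because the reduction is purely syntactic, it needs no model and stays cheap on CPU, which matches the description of the code engine above.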

FORMATTING

Final output is XML, Markdown, or JSON, with file path citations (@src/file.ts:10-20) that AI editors parse to render "Click to Open" links.
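The citation shape shown above (@path:start-end) is easy to produce and to parse back. The sketch below assumes paths contain no whitespace or colons; the regular expression and function names are illustrative, not part of SourcePrep.

```python
import re

# Matches citations such as @src/file.ts:10-20.
# Assumption: the path has no whitespace and no colon.
CITATION = re.compile(r"@(?P<path>[^\s:]+):(?P<start>\d+)-(?P<end>\d+)")

def format_citation(path, start, end):
    """Build a citation for lines start..end of path."""
    return f"@{path}:{start}-{end}"

def parse_citations(text):
    """Return every (path, start, end) citation found in text."""
    return [
        (m["path"], int(m["start"]), int(m["end"]))
        for m in CITATION.finditer(text)
    ]

found = parse_citations("See @src/file.ts:10-20 and @lib/util.py:1-5 for details.")
```

An editor integration could run a parser like this over the payload to decide which spans to turn into clickable links.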

What an assembled context payload looks like when an agent calls prep.

The dashboard view of the same assembled payload — every chunk carries its LOD badge and source citation.

Panel Controls

The Context Assembler panel in the dashboard lets you tune the pipeline.

Retrieval Settings

Chunks (k): how many distinct code blocks to retrieve from the vector database.
Default: 20. Increase for broad queries, decrease for precision.
Max chars: hard limit for the final output. Chunks stop being added once this budget is hit.
Default: 24,000 (fits comfortably in 32k windows).
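The arithmetic behind "fits comfortably" can be checked with a back-of-the-envelope estimate. The sketch assumes about 4 characters per token, a common rule of thumb for English text; real tokenizers vary, and code often tokenizes more densely, so treat the ratio as a parameter rather than a fact.

```python
def estimated_tokens(max_chars, chars_per_token=4.0):
    """Rough token estimate; chars_per_token is a heuristic, not a tokenizer."""
    return round(max_chars / chars_per_token)

def fits(max_chars, window_tokens, chars_per_token=4.0):
    """True if the estimated token count fits inside the context window."""
    return estimated_tokens(max_chars, chars_per_token) <= window_tokens

default_estimate = estimated_tokens(24_000)          # at ~4 chars per token
dense_estimate = estimated_tokens(24_000, 3.0)       # denser, code-like text
```

Even under the denser assumption, the default budget leaves most of a 32k-token window free for the conversation and the model's reply.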

Output Toggles

Sources: adds the @path/to/file:line-line citation header to each chunk.
Essential for AI editors to render clickable links.
Scores: appends the relevance score (0.0–1.0) to each chunk.
Useful when debugging why a chunk was included.
Structured: returns a JSON object instead of a text blob.
For programmatic integrations.
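If you drive the assembler from a script, it can help to keep these settings in one typed object. The class below is a hypothetical container, not a SourcePrep API: only the two defaults (k of 20 and max chars of 24,000) come from this page, and the toggles are left without defaults because their default state is not documented here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssemblerSettings:
    """Hypothetical mirror of the Context Assembler panel controls."""
    sources: bool      # add @path/to/file:line-line citation headers
    scores: bool       # append the 0.0-1.0 relevance score to each chunk
    structured: bool   # return a JSON object instead of a text blob
    k: int = 20                # chunks to retrieve (documented default)
    max_chars: int = 24_000    # hard output limit (documented default)

    def __post_init__(self):
        # Reject values that cannot describe a usable request.
        if self.k < 1:
            raise ValueError("k must be at least 1")
        if self.max_chars < 1:
            raise ValueError("max_chars must be at least 1")

debug_settings = AssemblerSettings(sources=True, scores=True, structured=False)
```

Making the object frozen keeps a settings value from drifting between the point where it is built and the point where the request is sent.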