
BYOK Cloud Batch Processing

How SourcePrep optimizes API usage when you bring your own cloud model.

🔒 Privacy Notice

SourcePrep only sends data to external LLM providers when you configure a BYOK (Bring Your Own Key) endpoint and explicitly trigger a build. By default, all processing is local. See our Privacy Policy for full details.

What Is Batch Processing?

SourcePrep's pipeline analyzes your codebase in multiple stages: cataloguing files, detecting cross-file relationships, enriching each file with deeper analysis, and grouping files into subsystem clusters. Each stage needs to process every relevant file in your project.

When using a local model (e.g., via Ollama), SourcePrep sends one file at a time to the model. This is fast locally and costs nothing.

When using a cloud model (BYOK), sending files one at a time would mean hundreds of individual API calls, each with network latency and repeated overhead. Instead, SourcePrep batches multiple files into a single API call, getting results for many files at once.
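In essence, batching is just chunking the file list before each call. A minimal sketch (the helper and batch size below are illustrative, not SourcePrep's actual code):

```python
def chunk(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# Hypothetical 200-file project, batched 25 files per API call
files = [f"src/file_{n}.py" for n in range(200)]
batches = chunk(files, 25)

print(len(batches))  # 8 API calls instead of 200
```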

What Data Is Sent?

Each batch contains the same data that would be sent in individual calls:

  • Catalogue stage: File names, symbol names, imports, and the first ~30 lines of each file.
  • Inferred edges stage: Source code for each file, plus a list of known file paths in the project (for relationship detection).
  • Epistemic enrichment: Source excerpts (up to 150 lines for code, up to 3000 lines for documentation), plus summaries of neighboring files.
  • Clustering: Previously-generated summaries of files in each cluster (not raw source code).

No data is sent to SourcePrep servers. All API calls go directly from your machine to the LLM provider you configured (e.g., OpenAI, Anthropic, Google).

How Batching Works

SourcePrep automatically selects a batch profile based on the model you configure. The profile determines how many files are processed per API call at each pipeline stage.

Profile    Models                                      Typical items/call
Large      Claude Sonnet 4.6+, Gemini 2.5 Pro          50–100
Standard   GPT-4.1, GPT-5, Claude 3                    25–50
Compact    DeepSeek, GPT-4o, Gemini Flash, Haiku 4.5   10–20
Local      Ollama, LM Studio                           1 (no batching)

The profile is selected automatically when you configure a model. You can override it in Settings > AI Models if needed.
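Conceptually, automatic selection is a lookup from model name to profile. A rough sketch based on the table above (the substring rules and profile names are assumptions, not SourcePrep's actual logic):

```python
# Hypothetical profile table; items_per_call ranges mirror the table above.
PROFILES = {
    "large":    {"items_per_call": (50, 100)},
    "standard": {"items_per_call": (25, 50)},
    "compact":  {"items_per_call": (10, 20)},
    "local":    {"items_per_call": (1, 1)},
}

def select_profile(model: str) -> str:
    """Map a configured model name to a batch profile (illustrative rules)."""
    m = model.lower()
    if any(k in m for k in ("sonnet-4", "gemini-2.5-pro")):
        return "large"
    if any(k in m for k in ("gpt-4.1", "gpt-5", "claude-3")):
        return "standard"
    if any(k in m for k in ("deepseek", "gpt-4o", "flash", "haiku")):
        return "compact"
    return "local"  # Ollama, LM Studio, and unrecognized models
```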

[Interactive preview: Configure BYOK endpoints for OpenAI, Anthropic, Google, and custom providers.]

Benefits

  • Faster builds: A 200-file project completes in ~8 API calls instead of ~200, finishing in under a minute instead of 10–15 minutes.
  • Lower cost: Batching amortizes per-call overhead (system prompts, instructions) across many items, saving ~25–33% on token costs.
  • Same results: Each file receives the same analysis whether processed individually or in a batch. The prompts and expected outputs are identical.

Structured Output & Reliability

When supported by the provider (OpenAI, Anthropic, Google), SourcePrep uses structured output mode: a JSON schema that guarantees the model returns valid, correctly-formatted results. This eliminates parse errors and ensures every item in the batch produces a usable result.
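With structured output, the provider is handed a JSON schema that every response must satisfy. A sketch of what such a schema for a batch of per-file results could look like (the field names are illustrative, not SourcePrep's actual schema):

```python
# Illustrative batch-result schema; with OpenAI this would be passed via the
# json_schema response format, with Anthropic/Google via their equivalents.
BATCH_RESULT_SCHEMA = {
    "type": "object",
    "properties": {
        "results": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "summary": {"type": "string"},
                },
                "required": ["path", "summary"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["results"],
    "additionalProperties": False,
}
```

Because the schema requires one entry per file, a conforming response cannot silently drop items or return malformed JSON.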

For providers without structured output support, SourcePrep falls back to robust JSON extraction with automatic retry for any items that fail to parse.
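Such a fallback can be sketched as a best-effort extractor (illustrative, not SourcePrep's actual implementation):

```python
import json
import re

def extract_json(text: str):
    """Best-effort JSON extraction for providers without structured output.

    Tries a direct parse first, then falls back to the outermost {...} span,
    since models often wrap JSON in prose or markdown fences.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return None  # caller re-queues this item for retry
```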

Error Handling

If a batched call fails (timeout, rate limit, etc.), SourcePrep:

  1. Retries the batch once.
  2. If it fails again, splits the batch in half and retries each half.
  3. As a last resort, falls back to processing items individually.

No data is lost: failed items are always retried, and partial progress is saved.
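The three steps above can be sketched as a single recursive function (`call_api` is a hypothetical stand-in for one batched provider call, which may raise on timeout or rate limit):

```python
def process_batch(items, call_api, retries=1):
    """Sketch of the retry/split/individual fallback described above."""
    try:
        return call_api(items)          # happy path: one call, many results
    except Exception:
        if retries > 0:                 # step 1: retry the whole batch once
            return process_batch(items, call_api, retries - 1)
        if len(items) > 1:              # step 2: split in half, retry each
            mid = len(items) // 2
            return (process_batch(items[:mid], call_api)
                    + process_batch(items[mid:], call_api))
        raise                           # a single item that still fails surfaces its error
```

Splitting recurses until batches reach size 1, so persistent failures degrade gracefully into individual processing (step 3) rather than losing the whole batch.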


💡 Cost Estimation

Before starting a build, the dashboard shows an estimated number of API calls and approximate cost based on your configured model. For example, a 200-file project with GPT-4.1 mini typically costs under $0.50.
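A rough sketch of how such an estimate can be computed (the per-token price and token counts below are placeholder assumptions, not SourcePrep's pricing data):

```python
import math

def estimate(files, items_per_call=25, tokens_per_file=1200,
             overhead_tokens=600, price_per_mtok=0.40):
    """Estimate API call count and cost (all parameters are illustrative)."""
    calls = math.ceil(files / items_per_call)
    input_tokens = files * tokens_per_file + calls * overhead_tokens
    return calls, input_tokens * price_per_mtok / 1_000_000

calls, cost = estimate(200)
print(f"{calls} calls, ~${cost:.2f}")
```

With these assumed numbers a 200-file project comes out well under $0.50, consistent with the figure above; real estimates depend on the configured model's actual pricing.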