Performance tuning
A quick-reference page listing the tuning knobs and their defaults. For the rationale, measurement recipes, and trade-offs, see the Performance section.
User-facing knobs (in shaktiman.toml)
| Knob | Default | Effect summary |
|---|---|---|
| search.max_results | 10 | Max results per search. Higher = more scoring work. |
| search.default_mode | locate | locate (compact) or full (inline source). |
| search.min_score | 0.15 | Relevance floor. Higher = less noise. |
| context.budget_tokens | 4096 | Default assembly budget. Higher = more work. |
| embedding.batch_size | 128 | Ollama batch size. Higher = better throughput if hardware allows. |
| embedding.timeout | "120s" | HTTP timeout per batch. Affects circuit-breaker sensitivity. |
| embedding.query_prefix / embedding.document_prefix | "" | Model-specific task prefixes. Not performance per se, but affects recall. |
| vector.backend | brute_force | See Backend selection. |
| database.backend | sqlite | See Backend selection. |
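
For orientation, the defaults in the table above could be written out in shaktiman.toml roughly as follows. This is a sketch that assumes each dotted knob name maps to a TOML table and key (for example, search.max_results becomes max_results under [search]), and that brute_force, sqlite, and locate are written as quoted strings. Check the Config File reference for the authoritative schema.

```toml
# Sketch only: table layout inferred from the dotted knob names above.
# Values are the documented defaults; omitting a key should have the same effect.

[search]
max_results = 10          # max results per search
default_mode = "locate"   # "locate" (compact) or "full" (inline source)
min_score = 0.15          # relevance floor

[context]
budget_tokens = 4096      # default assembly budget

[embedding]
batch_size = 128          # Ollama batch size
timeout = "120s"          # HTTP timeout per batch
query_prefix = ""         # model-specific task prefix for queries
document_prefix = ""      # model-specific task prefix for documents

[vector]
backend = "brute_force"

[database]
backend = "sqlite"
```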
Knobs not exposed via TOML (but in DefaultConfig)
These ship with defaults that are almost always correct. If you need to
change them, edit internal/types/config.go:DefaultConfig and rebuild:
| Knob | Default | Effect summary |
|---|---|---|
| EnrichmentWorkers | 4 | Parallel parse / extract workers. |
| WatcherDebounceMs | 200 | File-event coalescing window. |
| WriterChannelSize | 500 | SQLite writer backpressure. |
| MaxBudgetTokens | 4096 | Cap for assembly (usually matches context.budget_tokens). |
| Tokenizer | cl100k_base | Tokenizer for budget accounting. |
Per-call knobs (MCP / CLI)
Many MCP tools accept per-call overrides. Full list under each tool's reference:
- search: mode, max_results, min_score, path, scope.
- context: budget_tokens, scope.
- dependencies: direction, depth, scope.
- diff: since, limit, scope.
A per-call value wins over the TOML default.
Tuning playbook
Don't tune speculatively. Reach for these in order when something's actually slow:
1. Measure first. Run `time shaktiman search "query" --root .` and `shaktiman enrichment-status`. Baseline is the first commit.
2. Lower max_results for interactive use. 10 is plenty for most agent workflows.
3. Raise min_score if marginal hits dominate result sets.
4. Switch to hnsw (single-dev) or qdrant (shared) once the repo grows past ~75k chunks.
5. Raise batch_size if Ollama is running on a GPU and the queue is lagging.
6. Raise timeout if Ollama is occasionally slow and the circuit breaker is tripping unnecessarily.
If you reach step 6 and queries are still slow, consult Troubleshooting → Performance problems.
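
As an illustration of what steps 2 through 6 might look like once applied, here is a hypothetical shaktiman.toml fragment. The specific numbers are made-up examples rather than recommendations, and the layout assumes that dotted knob names map to TOML tables and that hnsw is a value of vector.backend. Change one knob at a time and re-measure after each change.

```toml
# Hypothetical tuned values; only the knob names come from this page.

[search]
max_results = 5       # step 2: lowered from the default of 10
min_score = 0.25      # step 3: raised from the default of 0.15

[vector]
backend = "hnsw"      # step 4: assumed value name, see Backend selection

[embedding]
batch_size = 256      # step 5: raised from the default of 128
timeout = "300s"      # step 6: raised from the default of "120s"
```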
See also
- Config File reference — full TOML schema with validation rules.
- Performance → Overview — the four axes and how to measure each.