
Performance tuning

A quick-reference page listing the tuning knobs and their defaults. For the rationale, measurement recipes, and trade-offs, see the Performance section.

User-facing knobs (in shaktiman.toml)

| Knob | Default | Effect summary |
|------|---------|----------------|
| `search.max_results` | 10 | Max results per search. Higher = more scoring work. |
| `search.default_mode` | `locate` | `locate` (compact) or `full` (inline source). |
| `search.min_score` | 0.15 | Relevance floor. Higher = less noise. |
| `context.budget_tokens` | 4096 | Default assembly budget. Higher = more work. |
| `embedding.batch_size` | 128 | Ollama batch size. Higher = better throughput if hardware allows. |
| `embedding.timeout` | `"120s"` | HTTP timeout per batch. Affects circuit-breaker sensitivity. |
| `embedding.query_prefix` / `document_prefix` | `""` | Model-specific task prefixes. Not performance per se, but affects recall. |
| `vector.backend` | `brute_force` | See Backend selection. |
| `database.backend` | `sqlite` | See Backend selection. |
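Taken together, the defaults above correspond to a shaktiman.toml like the following. The section/key layout is inferred from the dotted knob names; check your generated config for the exact shape.

```toml
[search]
max_results = 10
default_mode = "locate"
min_score = 0.15

[context]
budget_tokens = 4096

[embedding]
batch_size = 128
timeout = "120s"
query_prefix = ""
document_prefix = ""

[vector]
backend = "brute_force"

[database]
backend = "sqlite"
```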

Knobs not exposed via TOML (but in DefaultConfig)

These ship with defaults that are almost always correct. If you need to change them, edit internal/types/config.go:DefaultConfig and rebuild:

| Knob | Default | Effect summary |
|------|---------|----------------|
| `EnrichmentWorkers` | 4 | Parallel parse / extract workers. |
| `WatcherDebounceMs` | 200 | File-event coalescing window. |
| `WriterChannelSize` | 500 | SQLite writer backpressure. |
| `MaxBudgetTokens` | 4096 | Cap for assembly (usually matches `context.budget_tokens`). |
| `Tokenizer` | `cl100k_base` | Tokenizer for budget accounting. |

Per-call knobs (MCP / CLI)

Many MCP tools accept per-call overrides. Full list under each tool's reference:

  • search: mode, max_results, min_score, path, scope.
  • context: budget_tokens, scope.
  • dependencies: direction, depth, scope.
  • diff: since, limit, scope.

Per-call values win over TOML defaults.
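For example, a per-call override on the search tool might look like this in an MCP arguments payload. This is a hypothetical call shape; see the tool's reference for the exact schema.

```json
{
  "name": "search",
  "arguments": {
    "query": "parse config",
    "mode": "full",
    "max_results": 5,
    "min_score": 0.3
  }
}
```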

Tuning playbook

Don't tune speculatively. Reach for these in order when something's actually slow:

  1. Measure first. Run time shaktiman search "query" --root . and shaktiman enrichment-status, and record the baseline before changing anything.
  2. Lower max_results for interactive use. 10 is plenty for most agent workflows.
  3. Raise min_score if marginal hits dominate result sets.
  4. Switch to hnsw (single-dev) or qdrant (shared) once the repo grows past ~75k chunks.
  5. Raise batch_size if Ollama is running on a GPU and the embedding queue is lagging.
  6. Raise timeout if Ollama is occasionally slow and the circuit breaker trips unnecessarily.

If you reach step 6 and queries are still slow, consult Troubleshooting → Performance problems.

See also