Embedding failures
Covers Ollama connectivity, dimension mismatches, and circuit-breaker recovery.
Symptom: enrichment_status shows state open
The embedding circuit breaker has tripped — repeated failures from Ollama caused the worker to pause.
Likely causes
- Ollama isn't running.
- Ollama is running but doesn't have the configured model pulled.
- Ollama is responding but slowly; requests are timing out.
- Network path is blocked (firewall, wrong `ollama_url`).
Diagnostic
```shell
# Is Ollama reachable?
curl http://localhost:11434/api/tags

# Is the model pulled?
curl http://localhost:11434/api/tags | jq '.models[].name'
# Expect to see "nomic-embed-text" (or whatever [embedding].model is)

# Recent failures in the log
grep -i "embed" /path/to/project/.shaktiman/shaktimand.log | tail -n 20
```
Fix
- Ollama not running: `ollama serve` (or start the Ollama app on macOS).
- Model missing: `ollama pull nomic-embed-text` (substitute your model name).
- Slow responses: raise `[embedding].timeout` in config, or move Ollama to a machine with a GPU.
- Wrong URL: fix `[embedding].ollama_url` in `shaktiman.toml`.
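Putting those knobs together, a minimal `[embedding]` section might look like this (key names are the ones used on this page; the values are illustrative, not defaults):

```toml
[embedding]
enabled = true
model = "nomic-embed-text"
# Must be reachable from the machine running shaktimand.
ollama_url = "http://localhost:11434"
# Illustrative value — raise it if requests to Ollama are timing out.
timeout = 30
```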
After fixing the root cause, restart shaktimand — the circuit breaker resets on
startup. You don't need to re-index; the worker picks up where it left off.
The daemon is normally launched by your MCP client, so "restart it" means different things in different setups:
- Claude Code — close the Claude Code session for the project and reopen it. Claude Code owns the daemon's lifecycle and will spawn a fresh `shaktimand` on the next start. If you have multiple Claude Code windows open on the same project, close them all (any surviving window is acting as a proxy or leader).
- Cursor / Zed / other MCP clients — close the editor window that launched `shaktimand`. Reopen the project to spawn a new daemon.
- CLI-only / manual daemon — kill the leader directly with `kill "$(cat .shaktiman/daemon.pid)"`. The next MCP-client invocation (or manual `shaktimand /path/to/project`) acquires the lock and becomes the new leader.
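For the CLI-only case, the pid-file dance can be wrapped in a small guard so a stale pid file never signals an unrelated process. A sketch — the `restart_leader` helper is hypothetical; only the `.shaktiman/daemon.pid` path comes from this page:

```shell
# Hypothetical helper: signal the shaktimand leader recorded in the
# project's pid file, cleaning up stale entries instead of mis-signalling.
restart_leader() {
  pidfile="$1/.shaktiman/daemon.pid"
  [ -f "$pidfile" ] || { echo "no pid file"; return 0; }
  pid="$(cat "$pidfile")"
  if kill -0 "$pid" 2>/dev/null; then
    kill "$pid" && echo "signalled $pid"
  else
    rm -f "$pidfile"   # process already gone; pid file was stale
    echo "stale pid file removed"
  fi
}

# Demo against a throwaway directory holding a pid that is already dead.
demo="$(mktemp -d)"
mkdir -p "$demo/.shaktiman"
( : ) & deadpid=$!; wait "$deadpid"
echo "$deadpid" > "$demo/.shaktiman/daemon.pid"
restart_leader "$demo"
```

The `kill -0` probe checks process existence without sending a signal, which is what makes the stale-pid branch safe.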
Symptom: enrichment_status shows state disabled
The circuit breaker has moved past `open` — Ollama stayed unavailable across several retry cycles, so the worker gave up entirely.
Fix
Same as `open` above, but you must restart `shaktimand` after fixing the root cause. The `disabled` state is permanent within a process lifetime (by design — see ADR addendum A7).
If you've intentionally disabled embeddings (`embedding.enabled = false`), `disabled` is expected and fine.
Symptom: embedding runs but search results feel wrong
Likely causes
- Dimension mismatch. `[embedding].dims` doesn't match what the model produces. Inserts may be succeeding (no error) but the vectors are structurally wrong.
- Wrong task prefixes. nomic-embed-text is trained with `search_query:` / `search_document:` prefixes. Without them, quality drops noticeably.
- Old embeddings from a previous model. You changed `model` but didn't reindex — the vector store still has the old embeddings.
Diagnostic
```shell
# Confirm the model produces the dims you think it does
curl http://localhost:11434/api/embed -d '{
  "model": "nomic-embed-text",
  "input": ["test"]
}' | jq '.embeddings[0] | length'
# → 768 for nomic-embed-text
```
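If you want to sanity-check the jq expression itself without a live Ollama, run it against a canned response. The JSON stub below is made up for illustration — a real nomic-embed-text vector has 768 entries, not 4:

```shell
# Fabricated /api/embed-shaped response with a 4-dim vector,
# just to exercise the jq expression used above.
stub='{"model":"nomic-embed-text","embeddings":[[0.1,0.2,0.3,0.4]]}'
printf '%s' "$stub" | jq '.embeddings[0] | length'   # → 4
```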
Fix
- Dims mismatch: correct `[embedding].dims`, then `shaktiman reindex` — existing vectors are incompatible.
- Prefixes: set `[embedding].query_prefix = "search_query: "` and `document_prefix = "search_document: "` for nomic-embed-text.
- Model change without reindex: `shaktiman reindex --embed`.
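For nomic-embed-text specifically, the three fixes above correspond to config along these lines (key names from this page; values illustrative — verify `dims` with the diagnostic above):

```toml
[embedding]
model = "nomic-embed-text"
# Must match what the model actually emits (see the /api/embed check).
dims = 768
query_prefix = "search_query: "
document_prefix = "search_document: "
```

Remember that after changing any of these you need `shaktiman reindex` for existing vectors to be rebuilt.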
Symptom: the log shows "text exceeds 28k chars" warnings
Ollama's embed endpoint has an 8192-token context; Shaktiman's client warns conservatively when input text exceeds ~28k characters (roughly 7k tokens). The request is still sent — truncation happens on the Ollama side.
Fix
The only way to reliably fix this is to shrink your chunks. The parser tries to fit chunks within `max_chunk_tokens`, but ADR-004's fallback path (`splitByLines`) can still produce oversized chunks for huge no-symbol bodies (e.g. enormous JSON-in-JS files). Consider adding those files to `.shaktimanignore` if they're autogenerated.
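To find candidates for `.shaktimanignore`, a rough byte-size scan works, since a file under 28 KiB can't contain 28k characters. The `scan_oversized` helper is hypothetical, and byte count is only a proxy — a big file only trips the warning if it also defeats symbol-based chunking:

```shell
# Hypothetical helper: list files large enough that the line-split
# fallback could emit an oversized chunk (byte size ~ character count).
scan_oversized() {
  find "$1" -type f -size +28k -not -path '*/.git/*'
}

scan_oversized .
```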
See also
- Configuration → Embeddings — every knob.
- `enrichment_status` reference — how to read the output.
- Known Limitations → Embedding — the Ollama-only constraint.