How indexing works

When Shaktiman indexes a project, it parses every supported source file, splits each one into semantic chunks, extracts the symbols and call edges, and — if embeddings are enabled — sends the chunks to Ollama in batches to get vectors back. Everything ends up in .shaktiman/index.db (SQLite) and .shaktiman/embeddings.bin (when the default brute_force vector backend is used).

This page covers the behaviour you'll actually notice: cold vs. incremental indexing, the watcher, branch-switch detection, and when to force a rebuild.

The pipeline

For each source file the indexer runs:

Parse — tree-sitter produces an AST. Parse errors don't abort; partial ASTs are allowed through.
Chunk — the parser recursively walks the AST, emitting chunks at the nodes listed in each language's ChunkableTypes (see Supported Languages and ADR-004).
Extract symbols — one row per function / class / method / type.
Extract edges — imports, calls, type_ref, extends, implements, exports.
Write — a single SQLite writer serialises everything into index.db, updating the FTS5 index for keyword search as it goes.
Embed (optional) — if embedding.enabled is true and Ollama is reachable, the embedding worker drains chunks from the queue in batches (embed_batch_size, default 128) and writes vectors to the configured vector store.

Steps 1–5 are fast — for a mid-size Go/TypeScript repo the full cold index completes in a minute or two. Step 6 runs asynchronously and can take much longer on a large repo, but the index is queryable in keyword-fallback mode as soon as step 5 completes. You don't have to wait for embedding.

Cold index vs. incremental

Mode	When	How to trigger
Cold	First run on a project, after `reindex`, or after clearing `.shaktiman/`	`shaktiman index .` or start `shaktimand` (indexes on launch)
Incremental	Every save of a watched file while `shaktimand` is running	Automatic via the file watcher
Manual	CI, scripting, running outside of Claude Code	`shaktiman index .` (refuses if a daemon is already running — it handles indexing itself)

The file watcher

When shaktimand is the leader for a project, it starts an fsnotify-backed watcher over the project root. The watcher:

Watches directories, not individual files — conserves file descriptors on large repos.
Debounces events with a 200 ms window. Rapid saves of the same file collapse into one re-index. The window is a Go default (WatcherDebounceMs in internal/types/config.go) and isn't tunable via TOML today.
Respects .gitignore via the same glob logic the indexer uses. It also honours .shaktimanignore for Shaktiman-specific exclusions (e.g. generated protobuf files you want off the index without adding them to .gitignore).

Implementation: internal/daemon/watcher.go.

Branch-switch detection

If more than ~20 file changes arrive in a single debounce window, the watcher classifies it as a branch switch (common cause: git checkout, git pull, or a large refactor landing). It sends a one-shot signal on branchSwitchCh so the enrichment pipeline can treat the batch as a single coherent event rather than a cascade of per-file re-indexes.

When to re-index explicitly

The incremental watcher handles edits automatically. You only need to force a rebuild when:

You've upgraded Shaktiman and want new parser fixes / schema to apply across the whole corpus. shaktiman reindex . is the right call.
The index appears corrupted (rare — enrichment_status shows impossible counts, or queries return gibberish). reindex fixes this by purging and rebuilding.
You've changed your embedding model or dimensions. Existing vectors are incompatible with the new model; reindex regenerates them.

See Re-indexing for the flow and what's preserved.

Progress and Ctrl-C

The indexer streams progress to stdout (Indexing: N/M files, Embedding: N/M chunks). Progress redraws in place on a TTY and batches per 10 % when output is piped.

Ctrl-C during embedding is safe — the embedding worker periodically flushes vectors to disk (a background goroutine calls RunPeriodicEmbeddingSave). On the next run only unembedded chunks are processed.

Ctrl-C during metadata indexing also works — the SQLite transaction model means a partial index is internally consistent, and the next run will pick up unchanged files quickly (content-hash check skips re-parsing).

The pipeline​

Cold index vs. incremental​

The file watcher​

Branch-switch detection​

When to re-index explicitly​

Progress and Ctrl-C​

See also​