
Backend selection

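Two choices, the metadata backend and the vector backend, set the floor for every other performance property. They are mostly independent; the one coupling (postgres restricts the vector backend) is noted under Metadata backends. Pick these first.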

Metadata backends

| Backend | Query latency | Write throughput | Memory | Disk | Multi-host |
| --- | --- | --- | --- | --- | --- |
| sqlite (default) | Very low (in-process) | Moderate (single writer) | Low (~24 MB page cache) | 100–500 MB for typical repos | No (file-local) |
| postgres | Low (network RTT) | High | None locally | Remote | Yes |

Pick sqlite when: one developer, one machine, local-first. This is the intended default and almost always right.

Pick postgres when: shared team index, CI infrastructure, cross-host deployment. Mandatory constraint: postgres forces the vector backend to be pgvector or qdrant — see A12 in ADR-003.
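The postgres constraint above can be expressed as a small pre-flight check. This is a minimal sketch, not shaktiman's own validation: the function name `check_backend_pair` is hypothetical, and it models only the one rule stated here (postgres metadata requires pgvector or qdrant). Other pairings are not checked.

```python
# Hypothetical helper: models only the constraint stated in this section.
# postgres metadata requires a pgvector or qdrant vector backend.
POSTGRES_VECTOR_BACKENDS = {"pgvector", "qdrant"}


def check_backend_pair(metadata: str, vector: str) -> None:
    """Raise ValueError if the pair violates the postgres constraint."""
    if metadata == "postgres" and vector not in POSTGRES_VECTOR_BACKENDS:
        allowed = ", ".join(sorted(POSTGRES_VECTOR_BACKENDS))
        raise ValueError(
            f"metadata backend 'postgres' requires a vector backend in: "
            f"{allowed} (got '{vector}')"
        )


check_backend_pair("sqlite", "brute_force")  # passes silently
check_backend_pair("postgres", "pgvector")   # passes silently
```

Calling `check_backend_pair("postgres", "hnsw")` raises `ValueError`, which is the combination the constraint rules out.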

Measurement: shaktiman status shows disk footprint; time shaktiman search shows query latency.
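A single `time` run can be noisy, so it may help to repeat the command and take the median. The sketch below is a generic wall-clock timer, not part of shaktiman; `median_wall_time` is a hypothetical helper, and the stand-in command exists only so the example runs anywhere. Substitute the `shaktiman search` invocation you normally use.

```python
import statistics
import subprocess
import sys
import time


def median_wall_time(argv, runs=5):
    """Run argv `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in command; replace with your own shaktiman search invocation.
elapsed = median_wall_time([sys.executable, "-c", "pass"])
print(f"median: {elapsed * 1000:.1f} ms")
```

The figure includes process startup, so it measures what a user of the CLI actually waits for rather than the query alone.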

Vector backends

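| Backend | Search latency @ 10k | Search latency @ 100k | Search latency @ 500k | Memory | Disk |
| --- | --- | --- | --- | --- | --- |
| brute_force (default) | <20 ms | ~30 ms | >100 ms | All vectors in RAM | One file |
| hnsw | <10 ms | ~10 ms | ~15 ms | Hot subset | One file (larger than brute_force) |
| pgvector | Network RTT + ~10 ms | Network RTT + ~15 ms | Network RTT + ~20 ms | None locally | In Postgres |
| qdrant | Network RTT + ~5 ms | Network RTT + ~8 ms | Network RTT + ~10 ms | None locally | Remote |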

Numbers are indicative. Actual latency depends on embedding dimensionality (768 assumed), network round-trip time (for remote backends), and top-K.

Pick brute_force when: repo has under ~75k chunks (most repos), and you don't mind the full vector set in RAM. Simplest, no surprises.
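To judge whether the full vector set fits comfortably in RAM, a back-of-the-envelope estimate is chunks × dimensions × bytes per value. The sketch below assumes one vector per chunk, the 768 dimensions mentioned above, and 4-byte float32 values; the storage format is an assumption, not something this page specifies, and the estimate ignores any per-vector overhead.

```python
def vector_ram_bytes(chunks: int, dims: int = 768, bytes_per_value: int = 4) -> int:
    """Raw size of the vector set, assuming one float32 vector per chunk."""
    return chunks * dims * bytes_per_value


for chunks in (10_000, 75_000, 100_000, 500_000):
    mb = vector_ram_bytes(chunks) / 1_000_000
    print(f"{chunks:>7} chunks -> {mb:,.1f} MB")
```

Under these assumptions, 75k chunks come to about 230 MB and 500k chunks to about 1.5 GB of raw vector data.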

Pick hnsw when: repo is large (>100k chunks), you're on a single machine, and you want sub-linear search time. Cost: approximate recall instead of exact.
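For contrast, exact brute-force search scores every stored vector on every query, which is why its latency grows linearly with the chunk count while staying exact. The following is an illustrative pure-Python sketch of that idea, not shaktiman's implementation; cosine similarity is assumed as the metric, and the function names are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector, keep the k best (score, index) pairs."""
    scored = ((cosine(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)


vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
top = brute_force_top_k([1.0, 0.1], vectors, k=2)
print([i for _, i in top])  # indices of the two nearest vectors: [0, 2]
```

An HNSW index avoids the full scan by walking a graph of neighbours, so it visits only a fraction of the vectors per query and may occasionally miss a true nearest neighbour.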

Pick pgvector when: you've already chosen postgres metadata and want a single store.

Pick qdrant when: you need the best vector-search performance and are willing to run a separate service (or use Qdrant Cloud).

Trade-off table

| Choice | Pros | Cons | Recommended for |
| --- | --- | --- | --- |
| sqlite + brute_force | Zero setup, all local | Cold start on every process open (~10–30 ms) | Default. Works up to ~75k chunks. |
| sqlite + hnsw | Sub-linear search | Index file can corrupt on unclean shutdown | Large local repos. |
| postgres + pgvector | Single remote store | pgvector extension must be installed | Team deployments where latency tolerance is ~20 ms. |
| postgres + qdrant | Best vector performance | Two remote services to operate | Largest deployments, quality-sensitive retrieval. |

Switching backends

Changing backends requires a reindex:

# Change [vector].backend in shaktiman.toml (or use --vector flag)
shaktiman reindex /path/to/project --embed --vector hnsw

See Re-indexing for the full flow.

See also