dig vs a vector database — a complete retriever, not storage to operate
A vector database (Pinecone, Weaviate, Qdrant, Chroma) is storage you operate and populate with embeddings you generate. dig is a complete local retriever: hybrid FTS plus vector search, management, and reversibility, with no service to operate.
A vector database is one component. dig is the whole retriever.
What each one is
A vector database — Pinecone, Weaviate, Qdrant, Chroma — stores embeddings and serves nearest-neighbor queries. You operate it (hosted or self-run), generate the embeddings, write the ingestion, and build everything around it: chunking, lexical search, management.
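To make "serves nearest-neighbor queries" concrete, here is a minimal sketch of the core operation, assuming a tiny in-memory store and brute-force cosine similarity. The file names, vectors, and the `nearest` helper are hypothetical; production vector databases typically use approximate indexes rather than a full scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical embeddings: in practice you generate these with a model you choose.
store = {
    "notes.md": [1.0, 0.0, 0.0],
    "todo.txt": [0.0, 1.0, 0.0],
    "ideas.md": [0.9, 0.1, 0.0],
}

top = nearest([1.0, 0.0, 0.0], store, k=2)
print(top)
```

Everything outside this lookup, such as producing the vectors, chunking the source, and keeping the store current, is the surrounding pipeline you build yourself.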
dig is a local-first binary you drive from the command line or over MCP. It indexes a directory of real files and retrieves with a hybrid FTS + vector pipeline — the vector half plus lexical search and management in one tool. No service to provision.
Where they differ
- Scope. A vector DB is similarity search over vectors you supply. dig is the full retriever: `dig find --mode fts|vector|hybrid` combines lexical and semantic in one query.
- Storage. A vector DB is a store to keep in sync with your source. dig's index is a derived view over a content-addressed store; `dig scan` rebuilds it, so there is nothing to diverge.
- Reversibility. dig journals every mutation and `dig undo` walks it back. A vector store offers no changeset history.
- Management. dig also dedupes, detects drift, and reconciles the underlying files to a policy you declare. A vector DB stores; it doesn't organize.
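As an illustration of what combining lexical and semantic results in one query can mean, here is a minimal sketch using reciprocal rank fusion (RRF). This page does not say how dig merges its FTS and vector rankings, so RRF is an assumption chosen because it is a common fusion technique. The `rrf` function, the file names, and the constant `k=60` are hypothetical.

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists: each document scores 1 / (k + rank) per list."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a lexical (FTS) pass and a vector pass.
fts_hits = ["a.md", "b.md", "c.md"]
vector_hits = ["b.md", "d.md", "a.md"]

fused = rrf([fts_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one pass still appears further down. With a standalone vector database, a step like this is part of what you would write yourself.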
When to use which
Reach for a vector database when you're building a custom system that needs scalable, dedicated similarity search and you'll operate the surrounding pipeline.
Reach for dig when you want a complete local retriever — hybrid search, management, reversibility — with no service to run and no embedding pipeline to build.