Docs
Everything you need to drive dig — the first run, every command, and the policy file that decides where things live.
Quickstart
Index a directory, search it, reorganize by policy, and step back — all reversible.
quick start
# Create a knowledge base at a directory.
$dig init ~/library
# Index files into the content-addressed store (+ search index).
$dig scan
# Search, ranked. Add --mode hybrid for semantic recall.
$dig find "invoice acme 2024"
# Preview every move/rename/label your policy would apply — nothing is touched.
$dig org --dry-run
# Apply it; each change is journaled.
$dig org
# Step back — the last changeset is reversed, disk and all.
$dig undo
Command reference
Read commands (find, export, drift, log) support --json for other harnesses. Mutating commands (org, dedup, reconcile, merge, undo) journal every change.
| Command | What it does |
|---|---|
| dig init <root> | Create a knowledge base at a directory. Writes a per-KB .dig/ directory (store, index, config). |
| dig scan | Index files into the content-addressed store. --dry-run previews. Rebuilds the search index; queues vectors when retrieval is on. |
| dig find <query> | Search the KB, ranked. --mode fts|vector|hybrid (default fts, or the policy mode); --json; --limit. |
| dig embed | Drain the semantic-index backlog. Resumable, per-file commits. Needs a [retrieval] policy. |
| dig export | Emit a reproducible, manifest-pinned dataset. --at <manifest> pins a point in time; --filter by label/path/date. JSONL with provenance. |
| dig org | Apply organization policy (move/rename/label). --dry-run previews the full plan; conflicts are reported, never forced. |
| dig dedup | Collapse duplicates per policy. --dry-run previews. Never deletes the last copy; ties escalate. |
| dig drift | Report divergence from policy + external edits. --json. Surfaces misfiled/misnamed/duplicated/unsorted/pinned. |
| dig reconcile | Converge the KB to policy, one-shot. Auto where rules allow; human moves are pinned and escalated, never overwritten. |
| dig watch | Run continuously: observe, reconcile, escalate. --interval. Drains the semantic backlog per tick. Ctrl-C is a clean stop. |
| dig work <create|list|abort> | Open an isolated work view. Worktree-like; disjoint changesets merge back automatically. |
| dig merge <name> | Merge a work view back. Auto-resolves compatible ops; conflicts escalate surgically. |
| dig policy validate | Lint the policy file. Explains rule matches; unknown keys and path-escapes fail loudly. |
| dig log | Browse change history. --json. Newest first. |
| dig undo | Revert the last changeset. Disk mutations (org/dedup) are reversed; undoing a scan only rewinds history. |
Policy reference
The policy lives at .dig/policy.toml and travels with the KB. Validation is strict — unknown keys and path escapes fail loudly.
[[rule]]
Map matching files to a target folder, name, and labels. At least one of into/rename/label is required.
- name
- Unique rule name.
- match
- Conditions (all must hold): ext, mime, path glob, content_matches regex, size/date.
- into
- Target directory template, KB-root-relative. Vars: {year} {month} {day} {name} {ext}.
- rename
- Target filename template.
- label
- Labels to apply (accumulate across rules).
- autonomy
- "" | "auto" | "propose" — in watch, only "auto" rules act unattended.
[dedup]
Configure duplicate collapsing.
- strategy
- keep-oldest | keep-newest.
- on_conflict
- escalate (default) — never silently delete.
[retrieval]
Opt-in semantic retrieval. Off by default — find stays deterministic FTS.
- mode
- off (default) | hybrid | vector.
- base_url
- Any OpenAI-compatible /embeddings endpoint.
- model
- Embedding model name.
- doc_prefix / query_prefix
- Model task prefixes (model-specific, optional).
- api_key_env
- Env var holding the bearer token — keys never live in the file.
- rrf_k / chunk_size / …
- Tuning knobs (0 = default): rrf_k (60), candidate_factor (4), chunk_size (1000), chunk_overlap (200). Changing chunk size/overlap re-embeds the KB.