Docs

Everything you need to drive dig — the first run, every command, and the policy file that decides where things live.

Quickstart

Index a directory, search it, reorganize by policy, and step back. Every change is journaled, so it can be undone.

# Create a knowledge base at a directory.
$ dig init ~/library
# Index files into the content-addressed store (+ search index).
$ dig scan
# Search, ranked. Add --mode hybrid for semantic recall.
$ dig find "invoice acme 2024"
# Preview every move/rename/label your policy would apply; nothing is touched.
$ dig org --dry-run
# Apply it; each change is journaled.
$ dig org
# Step back: the last changeset is reversed, disk and all.
$ dig undo

Command reference

Read commands (find, export, drift, log) support --json for other harnesses. Mutating commands (org, dedup, reconcile, merge, undo) journal every change.

Command                       What it does
dig init <root>               Create a knowledge base at a directory. Writes a per-KB .dig/ directory (store, index, config).
dig scan                      Index files into the content-addressed store. --dry-run previews. Rebuilds the search index; queues vectors when retrieval is on.
dig find <query>              Search the KB, ranked. --mode fts|vector|hybrid (default fts, or the policy mode); --json; --limit.
dig embed                     Drain the semantic-index backlog. Resumable, per-file commits. Needs a [retrieval] policy.
dig export                    Emit a reproducible, manifest-pinned dataset. --at <manifest> pins a point in time; --filter by label/path/date. JSONL with provenance.
dig org                       Apply organization policy (move/rename/label). --dry-run previews the full plan; conflicts are reported, never forced.
dig dedup                     Collapse duplicates per policy. --dry-run previews. Never deletes the last copy; ties escalate.
dig drift                     Report divergence from policy + external edits. --json. Surfaces misfiled/misnamed/duplicated/unsorted/pinned.
dig reconcile                 Converge the KB to policy, one-shot. Auto where rules allow; human moves are pinned and escalated, never overwritten.
dig watch                     Run continuously: observe, reconcile, escalate. --interval. Drains the semantic backlog per tick. Ctrl-C is a clean stop.
dig work <create|list|abort>  Open an isolated work view. Worktree-like; disjoint changesets merge back automatically.
dig merge <name>              Merge a work view back. Auto-resolves compatible ops; conflicts escalate surgically.
dig policy validate           Lint the policy file. Explains rule matches; unknown keys and path-escapes fail loudly.
dig log                       Browse change history. --json. Newest first.
dig undo                      Revert the last changeset. Disk mutations (org/dedup) are reversed; undoing a scan only rewinds history.

Policy reference

The policy lives at .dig/policy.toml and travels with the KB. Validation is strict — unknown keys and path escapes fail loudly.

[[rule]]

Map matching files to a target folder, name, and labels. At least one of into/rename/label is required.

name
Unique rule name.
match
Conditions (all must hold): ext, mime, path glob, content_matches regex, size/date.
into
Target directory template, KB-root-relative. Vars: {year} {month} {day} {name} {ext}.
rename
Target filename template.
label
Labels to apply (accumulate across rules).
autonomy
"" | "auto" | "propose" — in watch, only "auto" rules act unattended.

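To make the [[rule]] fields concrete, here is a minimal Python sketch of how match conditions and the into/rename templates might be evaluated. It is an illustration, not dig's implementation: the Rule dataclass and the matches and plan_target helpers are hypothetical names, only the ext and path-glob conditions are modelled, and taking {year}/{month}/{day} from a caller-supplied timestamp is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class Rule:
    """Hypothetical in-memory form of one [[rule]] entry (illustration only)."""
    name: str
    ext: Optional[str] = None        # match condition: extension without the dot
    path_glob: Optional[str] = None  # match condition: KB-root-relative glob
    into: Optional[str] = None       # target directory template
    rename: Optional[str] = None     # target filename template


def matches(rule: Rule, rel_path: str) -> bool:
    """All configured conditions must hold; unset conditions are skipped."""
    p = PurePosixPath(rel_path)
    if rule.ext is not None and p.suffix.lstrip(".").lower() != rule.ext.lower():
        return False
    if rule.path_glob is not None and not fnmatch(rel_path, rule.path_glob):
        return False
    return True


def plan_target(rule: Rule, rel_path: str, when: datetime) -> str:
    """Expand the into/rename templates into a KB-root-relative target path."""
    p = PurePosixPath(rel_path)
    variables = {
        "year": f"{when.year:04d}",
        "month": f"{when.month:02d}",
        "day": f"{when.day:02d}",
        "name": p.stem,
        "ext": p.suffix.lstrip("."),
    }
    # A missing template leaves that part of the path unchanged.
    directory = rule.into.format(**variables) if rule.into else str(p.parent)
    filename = rule.rename.format(**variables) if rule.rename else p.name
    return str(PurePosixPath(directory) / filename)


# Assumed example rule: file PDFs from inbox/ under finance/<year>/<month>.
rule = Rule(name="invoices", ext="pdf", path_glob="inbox/*",
            into="finance/{year}/{month}", rename="{name}.{ext}")
print(matches(rule, "inbox/acme.pdf"))
print(plan_target(rule, "inbox/acme.pdf", datetime(2024, 3, 9)))
```

With the example rule, inbox/acme.pdf matches and is planned to land at finance/2024/03/acme.pdf. Nothing is moved; the sketch only computes the target, in the spirit of a dry run.
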
[dedup]

Configure duplicate collapsing.

strategy
keep-oldest | keep-newest.
on_conflict
escalate (default) — never silently delete.
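
The guarantees described for duplicate collapsing (keep one copy per strategy, never delete the last copy, escalate ties) can be sketched in a few lines of Python. This is a hedged illustration rather than dig's algorithm: Copy and plan_dedup are hypothetical names, and judging "oldest" and "newest" by modification time is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Copy:
    """Hypothetical record for one file on disk (illustration only)."""
    path: str
    content_hash: str
    mtime: float  # assumption: age is judged by modification time


def plan_dedup(copies: List[Copy], strategy: str = "keep-oldest"
               ) -> Tuple[List[str], List[List[str]]]:
    """Return (paths to remove, groups to escalate) without touching disk."""
    if strategy not in ("keep-oldest", "keep-newest"):
        raise ValueError(f"unknown strategy: {strategy}")

    # Group copies by content hash; identical content lands in one group.
    groups: Dict[str, List[Copy]] = {}
    for c in copies:
        groups.setdefault(c.content_hash, []).append(c)

    remove: List[str] = []
    escalate: List[List[str]] = []
    for group in groups.values():
        if len(group) < 2:
            continue  # a single copy is never a removal candidate
        ordered = sorted(group, key=lambda c: c.mtime,
                         reverse=(strategy == "keep-newest"))
        keeper, runner_up = ordered[0], ordered[1]
        if keeper.mtime == runner_up.mtime:
            # Tie: no clear winner, so hand the whole group to a human.
            escalate.append(sorted(c.path for c in group))
            continue
        # Exactly one keeper survives, so the last copy is never deleted.
        remove.extend(c.path for c in ordered[1:])
    return remove, escalate


copies = [
    Copy("a/report.pdf", "h1", 100.0),
    Copy("b/report.pdf", "h1", 200.0),
    Copy("c/notes.txt", "h2", 50.0),
    Copy("d/notes.txt", "h2", 50.0),
    Copy("e/solo.txt", "h3", 10.0),
]
remove, escalate = plan_dedup(copies)
print(remove)    # ['b/report.pdf']
print(escalate)  # [['c/notes.txt', 'd/notes.txt']]
```

The function returns a plan instead of deleting anything, which mirrors the preview-first workflow: the removal list and the escalation list can be inspected before any change is applied.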

[retrieval]

Opt-in semantic retrieval. Off by default — find stays deterministic FTS.

mode
off (default) | hybrid | vector.
base_url
Any OpenAI-compatible /embeddings endpoint.
model
Embedding model name.
doc_prefix / query_prefix
Model task prefixes (model-specific, optional).
api_key_env
Env var holding the bearer token — keys never live in the file.
rrf_k / chunk_size / …
Tuning knobs; a value of 0 selects the default shown in parentheses: rrf_k (60), candidate_factor (4), chunk_size (1000), chunk_overlap (200). Changing chunk size/overlap re-embeds the KB.
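
For intuition about rrf_k, chunk_size, and chunk_overlap, the sketch below shows standard reciprocal rank fusion (RRF) and a simple sliding-window chunker in Python. Both are assumptions about what these knobs control: the source does not state that hybrid mode uses RRF in exactly this form, nor whether chunk sizes are counted in characters or tokens. The names rrf_fuse and chunk_spans are hypothetical.

```python
from typing import Dict, List, Tuple


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def chunk_spans(length: int, size: int = 1000, overlap: int = 200
                ) -> List[Tuple[int, int]]:
    """Sliding (start, end) windows where neighbours share `overlap` units."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    spans: List[Tuple[int, int]] = []
    start = 0
    while start < length:
        end = min(start + size, length)
        spans.append((start, end))
        if end == length:
            break
        start = end - overlap  # step back so adjacent chunks overlap
    return spans


# Assumed example: one full-text ranking and one vector ranking.
fts_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
fused = rrf_fuse([fts_ranking, vector_ranking])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
print(chunk_spans(2500))  # [(0, 1000), (800, 1800), (1600, 2500)]
```

A larger k flattens the difference between top and lower ranks, so documents that appear in both lists are favored over a single high placement. The chunker also shows why changing size or overlap forces re-embedding: every span boundary moves, so every stored vector would describe different text.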