recall@k and hit@k — the IR metrics dig reports

recall@k and hit@k are standard information-retrieval metrics for ranked search; hit@5 is hit@k at a cutoff of 5. dig scores 98.0% hit@5 on the full LongMemEval-S set, fully local on CPU, beating MemPalace's published 96.6% recall_any@5.

dig is a retrieval layer, so it measures itself with standard information-retrieval (IR) metrics over ranked results — not answer quality, which belongs to the agent driving it.

What the metrics mean

For each question, dig ranks the candidate set and the metrics score how well the evidence lands near the top:

  • recall@k: the fraction of a question's evidence items that appear in the top-k results, averaged across questions.
  • hit@k: whether at least one evidence item appears in the top-k (a hit or a miss per question), reported as the share of questions that hit.
  • hit@5: hit@k at a cutoff of 5; the common headline number. Some benchmarks call this recall_any@5.
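The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not dig's implementation: the function names, the `questions` structure, and the session IDs are hypothetical, and each question is assumed to have a non-empty evidence set and a ranked list without duplicates.

```python
from typing import Iterable, Sequence, Set


def recall_at_k(ranked: Sequence[str], evidence: Set[str], k: int) -> float:
    """Fraction of the evidence items that appear in the top-k results."""
    if not evidence:
        raise ValueError("evidence set must not be empty")
    return len(set(ranked[:k]) & evidence) / len(evidence)


def hit_at_k(ranked: Sequence[str], evidence: Set[str], k: int) -> float:
    """1.0 if any evidence item appears in the top-k results, else 0.0."""
    return 1.0 if any(item in evidence for item in ranked[:k]) else 0.0


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy data: two questions, each with a ranked list of session IDs
# and the set of sessions that actually contain the evidence.
questions = [
    {"ranked": ["s3", "s7", "s1", "s9", "s4", "s2"], "evidence": {"s1", "s2"}},
    {"ranked": ["s5", "s6", "s8", "s0", "s3", "s4"], "evidence": {"s4"}},
]

recall_at_5 = mean(recall_at_k(q["ranked"], q["evidence"], 5) for q in questions)
hit_at_5 = mean(hit_at_k(q["ranked"], q["evidence"], 5) for q in questions)

print(f"recall@5 = {recall_at_5:.2f}, hit@5 = {hit_at_5:.2f}")
```

In the toy data, the first question finds one of its two evidence sessions in the top 5 (recall 0.5, a hit), and the second finds none (recall 0, a miss). This shows why hit@k is always at least as high as recall@k: a single evidence item in the top-k is enough for a hit, while recall requires all of them.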

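ndcg@10 (normalized discounted cumulative gain over the top 10 results) and mrr (mean reciprocal rank) add ranking-quality detail: both reward placing evidence higher in the list, not just inside the cutoff. You can reproduce them across modes: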

dig find "contract renewal terms" --mode hybrid --limit 5
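For reference, the two ranking-quality metrics mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions, not dig's scoring code: it assumes binary relevance (an item either is evidence or is not), a ranked list without duplicates, and the common log2 discount for DCG. The function names and sample IDs are hypothetical.

```python
import math
from typing import Sequence, Set


def reciprocal_rank(ranked: Sequence[str], evidence: Set[str]) -> float:
    """1 / rank of the first evidence item, or 0.0 if none is retrieved.

    mrr is the mean of this value across all questions.
    """
    for rank, item in enumerate(ranked, start=1):
        if item in evidence:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: Sequence[str], evidence: Set[str], k: int = 10) -> float:
    """Binary-relevance nDCG: DCG of the top-k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in evidence
    )
    # Ideal ranking: every evidence item placed at the top, up to k slots.
    ideal_hits = min(len(evidence), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: evidence sits at ranks 2 and 4.
ranked = ["s3", "s1", "s7", "s2"]
evidence = {"s1", "s2"}

rr = reciprocal_rank(ranked, evidence)
ndcg = ndcg_at_k(ranked, evidence, k=10)
print(f"reciprocal rank = {rr:.2f}, ndcg@10 = {ndcg:.3f}")
```

Here the first evidence item is at rank 2, so the reciprocal rank is 0.5. Both evidence items are retrieved, so recall@10 would be perfect, but ndcg@10 is below 1 because neither sits at the very top.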

dig's measured scores

On the full official LongMemEval-S set (500 questions, 19,829 sessions), dig's hybrid pipeline scores 98.0% hit@5, beating MemPalace's published 96.6% recall_any@5 bar with the same model class. The run is fully local on CPU with a small open embedding model, no reranker, and no LLM in the loop. On LoCoMo, hybrid reaches 91.3% hit@5.

See the full scoreboard · Hybrid retrieval