
Retrieval models

Which embedder and reranker Kimetsu ships, why, and how to swap or re-benchmark them.

The local retrieval stack is an embedder plus a cross-encoder reranker, both kept warm inside the embed daemon. The defaults were chosen with kimetsu brain bench, a benchmark seeded from real exported memories: 100 memories in confusable topic clusters and 210 cases covering keyword, paraphrase, oblique, confusable, no-answer, and multi-answer queries. Results:

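| embedder          | reranker     | recall@2 | recall@4 | MRR   | mean ms | peak RSS |
|-------------------|--------------|----------|----------|-------|---------|----------|
| jina-v2-base-code | jina-turbo   | 0.954    | 0.975    | 0.933 | 552     | 2.0 GB   |
| jina-v2-base-code | jina-tiny    | 0.949    | 0.975    | 0.931 | 414     | 2.0 GB   |
| jina-v2-base-code | minilm-l-4   | 0.949    | 0.959    | 0.927 | 372     | 2.3 GB   |
| jina-v2-base-code | tinybert-l-2 | 0.914    | 0.949    | 0.914 | 132     | 1.5 GB   |
| jina-v2-base-code | off          | 0.929    | 0.939    | 0.915 | 106     | 1.5 GB   |
| bge-small-en-v1.5 | off          | 0.931    | 0.966    | 0.911 | 446     | 359 MB   |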
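
For readers less familiar with the metric columns, the sketch below shows one common way to compute recall@k and MRR from ranked results. It is not taken from kimetsu brain bench: the function names, the case layout, and the treatment of special cases are assumptions made for illustration. In particular, it counts a multi-answer case as a hit if any relevant memory appears in the top k, and it skips no-answer cases entirely; the real benchmark may score both differently.

```python
from typing import List, Sequence, Set, Tuple

# One benchmark case: (memory ids in ranked order, set of relevant ids).
# This layout is hypothetical, not the format kimetsu brain bench uses.
Case = Tuple[Sequence[str], Set[str]]


def _answerable(cases: List[Case]) -> List[Case]:
    """Drop no-answer cases (empty relevant set); an assumption of this sketch."""
    return [case for case in cases if case[1]]


def recall_at_k(cases: List[Case], k: int) -> float:
    """Fraction of answerable cases with at least one relevant id in the top k."""
    scored = _answerable(cases)
    if not scored:
        return 0.0
    hits = sum(
        1 for ranked, relevant in scored
        if any(memory_id in relevant for memory_id in ranked[:k])
    )
    return hits / len(scored)


def mrr(cases: List[Case]) -> float:
    """Mean of 1/rank of the first relevant id (0 when none is retrieved)."""
    scored = _answerable(cases)
    if not scored:
        return 0.0
    total = 0.0
    for ranked, relevant in scored:
        for position, memory_id in enumerate(ranked, start=1):
            if memory_id in relevant:
                total += 1.0 / position
                break
    return total / len(scored)


# Toy cases with invented ids, purely to exercise the functions.
cases: List[Case] = [
    (["m1", "m7", "m3"], {"m1"}),        # first relevant id at rank 1
    (["m4", "m2", "m9"], {"m9"}),        # first relevant id at rank 3
    (["m5", "m6", "m8"], {"m6", "m8"}),  # multi-answer, first hit at rank 2
    (["m2", "m3", "m4"], set()),         # no-answer, skipped by this sketch
]

print(recall_at_k(cases, 2), recall_at_k(cases, 4), mrr(cases))
```

Under these definitions, a reranker that only moves the right memory from rank 3 to rank 1 leaves recall@4 unchanged but raises recall@2 and MRR, which is why the table reports all three.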

The default (jina-v2-base-code + ms-marco-tinybert-l-2-v2, listed as tinybert-l-2 in the table) is the fastest reranked combination: it is within ~2% MRR of the grid best and fits the hook's 300 ms budget. On this corpus, jina-v2 beats bge-small across every reranker, and any reranker beats none. The lean-RAM option is bge-small-en-v1.5 (~360-525 MB, at ~1-3% lower MRR).
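
The choice of default can be reproduced from the jina-v2-base-code rows of the table. The following is a minimal sketch, not Kimetsu's own selection code: the rule it applies (fastest combination that uses a reranker and stays within the 300 ms mean-latency budget) is inferred from the paragraph above, and the `Row` type and function name are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    reranker: Optional[str]  # None means the reranker is off
    mrr: float
    mean_ms: int


# jina-v2-base-code rows copied from the benchmark table.
ROWS: List[Row] = [
    Row("jina-turbo", 0.933, 552),
    Row("jina-tiny", 0.931, 414),
    Row("minilm-l-4", 0.927, 372),
    Row("tinybert-l-2", 0.914, 132),
    Row(None, 0.915, 106),
]

HOOK_BUDGET_MS = 300  # the hook's latency budget stated in the text


def fastest_reranked(rows: List[Row], budget_ms: int) -> Row:
    """Fastest row that uses a reranker and fits the budget (assumed rule)."""
    eligible = [r for r in rows if r.reranker is not None and r.mean_ms <= budget_ms]
    if not eligible:
        raise ValueError("no reranked combination fits the budget")
    return min(eligible, key=lambda r: r.mean_ms)


best_mrr = max(r.mrr for r in ROWS)
choice = fastest_reranked(ROWS, HOOK_BUDGET_MS)
relative_gap = (best_mrr - choice.mrr) / best_mrr

print(choice.reranker, f"{relative_gap:.1%}")
```

With these rows the sketch selects tinybert-l-2, whose MRR is roughly 2% below the best row, matching the figures quoted above. Raising the budget to 400 ms would leave the choice unchanged under this rule, since the rule prefers speed over MRR among eligible rows.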

Swapping models (takes effect after a daemon restart):

kimetsu config set embedder.model bge-small-en-v1.5
kimetsu config set embedder.reranker jina-reranker-v1-tiny-en   # or off, any HF ONNX id
kimetsu brain reindex          # REQUIRED after an embedder change
kimetsu brain daemon stop      # next prompt spawns a daemon with the new models

The KIMETSU_BRAIN_EMBEDDER environment variable overrides the configured embedder for a single process.

Re-judging as your brain grows:

kimetsu brain export bench/memories-export.json   # refresh the dataset source
kimetsu brain bench                               # full grid -> summary.md
kimetsu brain eval                                # fixture-based quick check

Watch-item: the semantic floor (broker.min_semantic_score, 0.35) was calibrated on bge-family cosine distributions; re-tune it against kimetsu brain eval after an embedder change. The remote server runs its own operator-configured reranker; see Kimetsu Remote.
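
To illustrate why the floor is embedder-specific, here is a rough sketch of what re-tuning could look like. Everything in it is hypothetical: the function names, the idea of choosing the floor so that a fixed share of known-relevant scores still pass, and the score values, which are invented placeholders rather than measurements. Kimetsu's broker may apply and calibrate broker.min_semantic_score differently.

```python
import math
from typing import List, Sequence, Tuple


def apply_floor(scored: Sequence[Tuple[str, float]], floor: float = 0.35) -> List[Tuple[str, float]]:
    """Keep (memory_id, cosine score) pairs whose score meets the floor."""
    return [(memory_id, score) for memory_id, score in scored if score >= floor]


def floor_keeping(relevant_scores: Sequence[float], keep: float = 0.9) -> float:
    """Highest floor that still admits at least `keep` of the relevant scores."""
    if not relevant_scores:
        raise ValueError("need at least one relevant score")
    if not 0.0 < keep <= 1.0:
        raise ValueError("keep must be in (0, 1]")
    ranked = sorted(relevant_scores, reverse=True)
    needed = max(1, math.ceil(keep * len(ranked)))
    return ranked[needed - 1]


# Invented cosine scores of known-relevant memories under a new embedder.
relevant = [0.62, 0.58, 0.55, 0.51, 0.49, 0.47, 0.44, 0.41, 0.38, 0.22]

candidate_floor = floor_keeping(relevant, keep=0.9)
passed = apply_floor([("a", 0.40), ("b", 0.30)], floor=0.35)

print(candidate_floor, passed)
```

If an embedder's cosine scores sit systematically lower or higher than the bge family's, a fixed floor of 0.35 would drop relevant memories or admit noise, so a candidate value like the one computed here would still need to be checked against kimetsu brain eval before being set.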