Kimetsu Algorithm
How Kimetsu turns memory into measurable savings: every cited memory is
credited with the tokens it saved, every injection is charged for the tokens
it cost, and kimetsu brain roi reports the net. This page documents the
exact accounting so the ledger can be audited rather than trusted.
Source of truth for the per-kind constants:
`crates/kimetsu-brain/src/roi.rs`, `SAVED_TOKENS_PER_CITATION`
Formula
estimated_saved_tokens = Σ (citations_of_kind × saved_tokens_per_kind)
net_tokens = estimated_saved_tokens − injected_tokens
net_usd = net_tokens / 1_000_000 × price_per_mtok

- `citations`: rows in `memory_citations` for the selected window (attributed to runs via `runs.started_at` or to the hook sentinel via `memory_citations.cited_at`).
- `injected_tokens`: sum of the `used_tokens` field across `context.injected` events in the window. This is the actual token cost the brain added to each prompt: the "spent" side of the ledger.
- `price_per_mtok`: resolved from `[model] price_per_mtok` in `project.toml` (user override), falling back to the built-in approximate table in `roi.rs`. Unknown models produce `usd: null` in the JSON output rather than a potentially-wrong number.
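The three formula lines can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not the code in `roi.rs`: the function names and the `(count, per_kind)` tuple layout are hypothetical, and the only behavior taken from this page is the arithmetic itself, the possibility of a negative net, and the missing USD value when no price is known.

```rust
/// Sum of `citations_of_kind × saved_tokens_per_kind` over all kinds.
/// Each tuple is (citation_count, saved_tokens_per_citation); this layout
/// is an assumption made for the sketch.
fn estimated_saved_tokens(citations: &[(u64, u64)]) -> u64 {
    citations
        .iter()
        .map(|&(count, per_kind)| count * per_kind)
        .sum()
}

/// Net tokens = saved − injected. The result is signed because the
/// ledger is allowed to go negative.
fn net_tokens(estimated_saved: u64, injected: u64) -> i64 {
    estimated_saved as i64 - injected as i64
}

/// Net USD = net_tokens / 1_000_000 × price_per_mtok.
/// Returns `None` when no price is known, mirroring `usd: null`.
fn net_usd(net: i64, price_per_mtok: Option<f64>) -> Option<f64> {
    price_per_mtok.map(|price| net as f64 / 1_000_000.0 * price)
}
```

For instance, two citations worth 1 500 tokens each plus three worth 500 each give 4 500 estimated saved tokens; against 6 000 injected tokens the net is −1 500 tokens, which at an illustrative $3/MTok is about −$0.0045.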
Per-kind constants
| Kind | Saved tokens / citation | Reasoning |
|---|---|---|
| `failure_pattern` | 1 500 | Avoids the "try → fail → diagnose → fix" loop. Estimated at ~3 tool calls × ~500 tokens/call. |
| `fact` | 500 | Avoids a docs lookup or user question. ~1 exchange. |
| `command` | 400 | Avoids a `--help` trial or web search. ~1-2 tool calls. |
| `convention` | 300 | Avoids a code-search to find the project pattern. ~1-2 searches. |
| `preference` | 200 | Avoids one clarifying question. ~1 exchange. |
These are deliberate under-estimates. The goal is for every "net positive" result to be credible, not just flattering. A user who sees a positive number should be able to believe it.
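One way to picture the table is as a lookup from memory kind to credit. The sketch below is an assumption about shape only: the `MemoryKind` enum, its method, and `credited_tokens` are hypothetical names, and the real `SAVED_TOKENS_PER_CITATION` in `roi.rs` may be structured differently. The numeric values are copied from the table.

```rust
/// Hypothetical enum mirroring the memory kinds listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryKind {
    FailurePattern,
    Fact,
    Command,
    Convention,
    Preference,
}

impl MemoryKind {
    /// Tokens credited per citation, copied from the per-kind table.
    const fn saved_tokens_per_citation(self) -> u64 {
        match self {
            MemoryKind::FailurePattern => 1_500,
            MemoryKind::Fact => 500,
            MemoryKind::Command => 400,
            MemoryKind::Convention => 300,
            MemoryKind::Preference => 200,
        }
    }
}

/// Total credit for a window, given (kind, citation_count) pairs.
fn credited_tokens(citations: &[(MemoryKind, u64)]) -> u64 {
    citations
        .iter()
        .map(|&(kind, count)| kind.saved_tokens_per_citation() * count)
        .sum()
}
```

With this shape, two `fact` citations and one `preference` citation would be credited 2 × 500 + 200 = 1 200 tokens.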
Why under-claiming is deliberate
Kimetsu's ROI story is built on honesty:
- Net CAN be negative, and is shown as such. If the brain injected more tokens than it saved (e.g. every retrieved memory was a silent passenger), the ledger says so.
- Constants are calibrated low: the actual avoided cost of re-discovering a `failure_pattern` is often much higher (the model might run the same broken command five times before giving up). We use a conservative floor.
- Citations are sparse in MCP hosts: Claude Code currently has no `kimetsu_brain_cite` tool, so the `memory_citations` table is populated only when the pipeline (`run coding`) is in use. The ledger will under-count savings in pure MCP-host sessions until the cite tool ships. That's fine: under-counting keeps the results trustworthy.
Sanity anchor: Terminal-Bench calibration evidence
The 16-task slice recorded in internal bench runs shows:
| Condition | Cost/win |
|---|---|
| Without brain context | $2.47 |
| With brain context | $0.19 |
This ~13× difference is the observed cost difference, not a model. The per-kind constants were back-derived from this data: if a typical winning run uses ~2 citations of mixed kinds, the implied saving is roughly ($2.47 − $0.19) / 2 ≈ $1.14/citation at Claude Sonnet 3.5 pricing (≈$3/MTok → ~380k tokens / citation). Our constants sum to far less than that for a 2-citation session, confirming that we are well inside the conservative zone.
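The arithmetic behind this sanity check can be reproduced with a short sketch. The function names are hypothetical; the inputs are the figures quoted above (cost per win with and without brain context, ~2 citations per winning run, ≈$3/MTok), and the upper bound assumes every citation is a `failure_pattern`, the highest-credit kind at 1 500 tokens.

```rust
/// Implied saved tokens per citation, derived from a cost-per-win gap.
fn implied_tokens_per_citation(
    cost_without_usd: f64,
    cost_with_usd: f64,
    citations_per_win: f64,
    price_per_mtok: f64,
) -> f64 {
    let usd_per_citation = (cost_without_usd - cost_with_usd) / citations_per_win;
    usd_per_citation / price_per_mtok * 1_000_000.0
}

/// Largest credit the ledger could assign to `citations` citations:
/// every one counted as a `failure_pattern` (1 500 tokens each).
fn max_ledger_credit(citations: u64) -> u64 {
    citations * 1_500
}
```

Calling `implied_tokens_per_citation(2.47, 0.19, 2.0, 3.0)` gives roughly 380 000 tokens per citation, while `max_ledger_credit(2)` is 3 000 tokens for the whole session, so the ledger credits at most 1 500 tokens where the benchmark gap implies about 380 000.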
Limitations
- Citations are sparse in MCP-host sessions. Until `kimetsu_brain_cite` is deployed, the ledger only counts citations generated by the autonomous pipeline. The `net_tokens` figure may appear falsely negative for users who work exclusively via the Claude Code plugin.
- Counterfactuals are estimates, not measurements. We cannot observe what the model would have done without the brain context. The constants are conservative estimates based on tool-call patterns, not A/B experiments.
- Price table is approximate. The built-in $/MTok figures are rounded retail/API-key prices as of June 2026. Set `[model] price_per_mtok` in `project.toml` for accurate billing data.
- Injection cost uses input-token pricing. The model also produces output tokens in response to the injected context; those are not counted here (they depend on the model's response length, which varies).
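The price resolution order described on this page (project override first, then the built-in table, otherwise no USD figure) could look roughly like the sketch below. The function name, its signature, and the `HashMap` representation of the built-in table are assumptions for illustration; the actual lookup in `roi.rs` may differ.

```rust
use std::collections::HashMap;

/// Resolve the $/MTok price for a model.
/// Order: explicit project override, then the built-in table, else `None`
/// (which would surface as `usd: null` rather than a guessed number).
fn resolve_price_per_mtok(
    model: &str,
    project_override: Option<f64>,
    builtin_table: &HashMap<&str, f64>,
) -> Option<f64> {
    project_override.or_else(|| builtin_table.get(model).copied())
}
```

Returning `Option<f64>` keeps the "unknown model" case explicit, so a caller has to decide what to print instead of silently falling back to a default price.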
Config reference
# project.toml: override the built-in price table for this project
[model]
model = "claude-sonnet-5"
price_per_mtok = 3.0 # optional; defaults to built-in table

# CLI
kimetsu brain roi # last 30 days (default)
kimetsu brain roi --window 7d # last 7 days
kimetsu brain roi --window all # all time
kimetsu brain roi --json # machine-readable RoiReport