Kimetsu Algorithm

How Kimetsu turns memory into measurable savings: every cited memory is credited with the tokens it saved, every injection is charged for the tokens it cost, and kimetsu brain roi reports the net. This page documents the exact accounting so the ledger can be audited rather than trusted.

Source of truth for the per-kind constants: crates/kimetsu-brain/src/roi.rs, SAVED_TOKENS_PER_CITATION


Formula

estimated_saved_tokens = Σ (citations_of_kind × saved_tokens_per_kind)
net_tokens             = estimated_saved_tokens − injected_tokens
net_usd                = net_tokens / 1_000_000 × price_per_mtok
  • citations: rows in memory_citations for the selected window (attributed to runs via runs.started_at or to the hook sentinel via memory_citations.cited_at).
  • injected_tokens: sum of the used_tokens field across context.injected events in the window. This is the actual token cost the brain added to each prompt: the "spent" side of the ledger.
  • price_per_mtok: resolved from [model] price_per_mtok in project.toml (user override), falling back to the built-in approximate table in roi.rs. Unknown models produce usd: null in the JSON output rather than a potentially-wrong number.
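
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the code in roi.rs: the function name `roi`, the `RoiResult` container, and the in-memory shape of the inputs are assumptions made for the example. The constants are copied from the per-kind table on this page, and the inputs are assumed to be already filtered to the selected window.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Per-kind constants copied from the table on this page (tokens per citation).
SAVED_TOKENS_PER_CITATION = {
    "failure_pattern": 1500,
    "fact": 500,
    "command": 400,
    "convention": 300,
    "preference": 200,
}


@dataclass
class RoiResult:  # hypothetical container, not the real RoiReport
    estimated_saved_tokens: int
    injected_tokens: int
    net_tokens: int
    net_usd: Optional[float]  # None when no price is known


def roi(cited_kinds: Iterable[str],
        used_tokens_per_injection: Iterable[int],
        price_per_mtok: Optional[float]) -> RoiResult:
    """Apply the ledger formula to one window of citations and injections."""
    counts = Counter(cited_kinds)
    saved = sum(n * SAVED_TOKENS_PER_CITATION[kind] for kind, n in counts.items())
    injected = sum(used_tokens_per_injection)
    net = saved - injected  # may be negative
    usd = None if price_per_mtok is None else net / 1_000_000 * price_per_mtok
    return RoiResult(saved, injected, net, usd)


# One failure_pattern and two fact citations against two injections.
report = roi(["failure_pattern", "fact", "fact"], [800, 600], price_per_mtok=3.0)
print(report.net_tokens)  # 2500 saved - 1400 injected = 1100
```

With no citations and a single 500-token injection, the same function returns a net of −500 tokens, which is the "net can be negative" case described later on this page.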

Per-kind constants

| Kind | Saved tokens / citation | Reasoning |
|---|---|---|
| failure_pattern | 1,500 | Avoids the "try → fail → diagnose → fix" loop. Estimated at ~3 tool calls × ~500 tokens/call. |
| fact | 500 | Avoids a docs lookup or user question. ~1 exchange. |
| command | 400 | Avoids a --help trial or web search. ~1-2 tool calls. |
| convention | 300 | Avoids a code-search to find the project pattern. ~1-2 searches. |
| preference | 200 | Avoids one clarifying question. ~1 exchange. |

These are deliberate under-estimates. The goal is for every "net positive" result to be credible, not just flattering. A user who sees a positive number should be able to believe it.
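
To see how the constants weight a session, the following sketch breaks the credited savings down by kind. The helper name `savings_by_kind` and the sample citation list are illustrative assumptions; only the constants come from the table above.

```python
from collections import Counter

# Per-kind constants copied from the table above (tokens per citation).
SAVED_TOKENS_PER_CITATION = {
    "failure_pattern": 1500,
    "fact": 500,
    "command": 400,
    "convention": 300,
    "preference": 200,
}


def savings_by_kind(cited_kinds):
    """Return {kind: credited tokens} for a list of cited memory kinds."""
    counts = Counter(cited_kinds)
    return {kind: n * SAVED_TOKENS_PER_CITATION[kind] for kind, n in counts.items()}


# Hypothetical session: one failure_pattern, two commands, one preference.
sample = ["failure_pattern", "command", "command", "preference"]
breakdown = savings_by_kind(sample)

# Print the largest contributors first.
for kind, tokens in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{kind:16} {tokens:>5}")
print("total", sum(breakdown.values()))  # 1500 + 800 + 200 = 2500
```

In this sample a single failure_pattern citation outweighs the other three citations combined, which matches the relative sizes of the constants.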


Why under-claiming is deliberate

Kimetsu's ROI story is built on honesty:

  1. Net CAN be negative, and is shown as such. If the brain injected more tokens than it saved (e.g. every retrieved memory was a silent passenger), the ledger says so.
  2. Constants are calibrated low: the actual avoided cost of re-discovering a failure_pattern is often much higher (the model might run the same broken command five times before giving up). We use a conservative floor.
  3. Citations are sparse in MCP hosts: Claude Code currently has no kimetsu_brain_cite tool, so the memory_citations table is populated only when the pipeline (run coding) is in use. The ledger will under-count savings in pure MCP-host sessions until the cite tool ships. That's fine: under-counting keeps the results trustworthy.

Sanity anchor: Terminal-Bench calibration evidence

The 16-task slice recorded in internal bench runs shows:

| Condition | Cost/win |
|---|---|
| Without brain context | $2.47 |
| With brain context | $0.19 |

This ~13× gap is an observed cost difference, not a modeled one. The per-kind constants were back-derived from this data: if a typical winning run uses ~2 citations of mixed kinds, the implied saving is roughly ($2.47 − $0.19) / 2 ≈ $1.14 per citation, which at Claude 3.5 Sonnet pricing (≈$3/MTok) is ~380k tokens per citation. Our constants credit far less than that: even two failure_pattern citations, the largest constant, add up to only 3,000 tokens. That confirms we are well inside the conservative zone.
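
The arithmetic behind this sanity check can be reproduced directly. The snippet below is only a worked calculation using the figures quoted in this section; the variable names are illustrative, and the 2-citations-per-win figure is the same assumption the text makes.

```python
cost_without = 2.47    # $/win without brain context (from the table above)
cost_with = 0.19       # $/win with brain context (from the table above)
citations_per_win = 2  # assumption stated in the text
price_per_mtok = 3.0   # approximate $/MTok used in the text

ratio = cost_without / cost_with
usd_per_citation = (cost_without - cost_with) / citations_per_win
implied_tokens_per_citation = usd_per_citation / price_per_mtok * 1_000_000

# Largest credit the ledger can give a 2-citation session:
# two failure_pattern citations at 1,500 tokens each.
max_credit = citations_per_win * 1500
headroom = citations_per_win * implied_tokens_per_citation / max_credit

print(f"ratio: {ratio:.1f}x")
print(f"implied saving: ${usd_per_citation:.2f} per citation")
print(f"implied tokens: {implied_tokens_per_citation:,.0f} per citation")
print(f"observed saving is about {headroom:.0f}x the maximum ledger credit")
```

Under these assumptions the observed saving exceeds the largest possible ledger credit for a 2-citation session by more than two orders of magnitude.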


Limitations

  1. Citations are sparse in MCP-host sessions. Until kimetsu_brain_cite is deployed, the ledger only counts citations generated by the autonomous pipeline. The net_tokens figure may appear falsely negative for users who work exclusively via the Claude Code plugin.
  2. Counterfactuals are estimates, not measurements. We cannot observe what the model would have done without the brain context. The constants are conservative estimates based on tool-call patterns, not A/B experiments.
  3. Price table is approximate. The built-in $/MTok figures are rounded retail/API-key prices as of June 2026. Set [model] price_per_mtok in project.toml for accurate billing data.
  4. Injection cost uses input-token pricing. The model also produces output tokens in response to the injected context; those are not counted here (they depend on the model's response length, which varies).

Config reference

# project.toml: override the built-in price table for this project
[model]
model = "claude-sonnet-5"
price_per_mtok = 3.0   # optional; defaults to built-in table

# CLI
kimetsu brain roi              # last 30 days (default)
kimetsu brain roi --window 7d  # last 7 days
kimetsu brain roi --window all # all time
kimetsu brain roi --json       # machine-readable RoiReport
