Install & Host Wiring
Installing Kimetsu
Kimetsu is a single Rust binary. The only choice you make at install time is
lean vs semantic (embeddings), because that is the only part baked into the
binary. Which host agents you use (Claude Code, Codex, Pi, OpenClaw) is a
runtime choice: change it anytime with kimetsu plugin install/uninstall, no
reinstall needed. The official prebuilt and npm binaries include all six host
integrations; a bare source cargo install is minimal, and you add the Pi and
OpenClaw integrations to it with --features pi,openclaw.
cargo
# Default lean build: fast lexical (FTS) retrieval, no model download
cargo install kimetsu-cli
# Semantic build: fastembed + ONNX; first run downloads the embedding model
cargo install kimetsu-cli --features embeddings
# Add the Pi + OpenClaw host integrations to a source build (prebuilts already have them)
cargo install kimetsu-cli --features pi,openclaw
# Everything:
cargo install kimetsu-cli --features embeddings,pi,openclaw
# From source
cargo install --path crates/kimetsu-cli # add --features embeddings,pi,openclaw for full build
npm
Installs the prebuilt binary for your platform, no Rust required:
npm install -g kimetsu-ai # lean build (all host integrations included)
kimetsu npm-flavor embeddings # one-time: switch to the semantic build (it persists)
npm pulls only the matching per-platform package (@kimetsu-ai/*) via
optionalDependencies, so there's no postinstall download, and it works under
npm install --ignore-scripts. kimetsu npm-flavor embeddings fetches the
semantic build once and remembers the choice (no env var to keep exported);
kimetsu npm-flavor lean switches back, and kimetsu npm-flavor status shows
the current one. (The KIMETSU_NPM_FLAVOR env var still works as a per-run
override.) The embeddings build is available where ONNX Runtime prebuilts exist
(Linux x64, macOS Apple Silicon, Windows x64); elsewhere it stays lean. See
npm/ for details.
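The flavor rules above (a remembered choice, a per-run env override, and a lean fallback where no embeddings build exists) can be sketched as a small function. This is an illustration of the described behaviour, not Kimetsu's code: the function name, the platform keys, and the exact order of checks are assumptions.

```python
# Illustrative sketch only; names and platform keys are hypothetical.

# Platforms the text lists as having an embeddings build.
EMBEDDINGS_PLATFORMS = {
    ("linux", "x64"),
    ("darwin", "arm64"),  # macOS Apple Silicon
    ("win32", "x64"),
}


def resolve_flavor(persisted, env, os_name, arch):
    """Pick 'lean' or 'embeddings' for one run.

    persisted: the flavor remembered by `kimetsu npm-flavor ...`, or None.
    env: a mapping of environment variables, for example os.environ.
    """
    # The env var is described as a per-run override, so it wins when set.
    requested = env.get("KIMETSU_NPM_FLAVOR") or persisted or "lean"
    if requested == "embeddings" and (os_name, arch) not in EMBEDDINGS_PLATFORMS:
        # No ONNX Runtime prebuilt for this platform: stay lean.
        return "lean"
    return requested


print(resolve_flavor("embeddings", {}, "linux", "x64"))
print(resolve_flavor("embeddings", {"KIMETSU_NPM_FLAVOR": "lean"}, "linux", "x64"))
print(resolve_flavor("embeddings", {}, "darwin", "x64"))
```

The three calls print embeddings, lean, and lean: the persisted choice applies on a supported platform, the env var overrides it for one run, and macOS Intel falls back to lean.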
Pre-built archives
Archives for Linux, macOS, and Windows are attached to every
GitHub Release. Extract the archive and put
kimetsu / kimetsu.exe somewhere on PATH (~/.local/bin, /usr/local/bin,
or %USERPROFILE%\.cargo\bin). Every prebuilt archive, lean and embeddings,
bundles all six host integrations, so switching hosts never needs a reinstall.
Lean archives are published for Linux,
macOS Intel, macOS Apple Silicon, and Windows. Embeddings archives are
published where ONNX Runtime prebuilts are available: Linux x86_64,
macOS Apple Silicon, and Windows x86_64.
Health check, updates, uninstall
kimetsu --version
kimetsu doctor # checks paths, brain.db, embedder, MCP, bridge
kimetsu update --check
kimetsu update # updates discovered kimetsu binaries on PATH/current install
kimetsu uninstall --dry-run
kimetsu uninstall --yes # removes discovered kimetsu binaries
kimetsu update downloads the matching GitHub Release archive for your
platform and flavor, then updates the current executable plus verified
kimetsu copies in known install locations such as Cargo bin, ~/.local/bin,
/usr/local/bin, or %USERPROFILE%\.cargo\bin. It does not scan the whole
disk. kimetsu uninstall removes those same verified binaries; it leaves
project .kimetsu/ directories and the user brain intact unless you explicitly
pass --delete-user-data.
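To make "known install locations, not a whole-disk scan" concrete, here is a minimal sketch of that kind of discovery. It is an assumption about the general approach, not the actual update/uninstall code: the function names are hypothetical, the Unix Cargo bin is taken to be ~/.cargo/bin, and the verification step the text mentions is left out.

```python
from pathlib import PurePosixPath, PureWindowsPath


def known_install_paths(home, windows=False):
    """Return the fixed candidate locations for a kimetsu binary."""
    if windows:
        # %USERPROFILE%\.cargo\bin from the text; `home` stands in for %USERPROFILE%.
        return [PureWindowsPath(home) / ".cargo" / "bin" / "kimetsu.exe"]
    base = PurePosixPath(home)
    return [
        base / ".cargo" / "bin" / "kimetsu",
        base / ".local" / "bin" / "kimetsu",
        PurePosixPath("/usr/local/bin/kimetsu"),
    ]


def discover(home, exists, windows=False):
    """Keep only candidates that exist; `exists` is an injected predicate."""
    return [p for p in known_install_paths(home, windows) if exists(p)]


# Pretend only the /usr/local/bin copy is present.
found = discover("/home/ada", lambda p: str(p).startswith("/usr"))
print([str(p) for p in found])
```

Because the candidate list is fixed, the cost of discovery does not grow with the size of the disk, and a copy placed anywhere else is simply not touched.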
Prerequisites: Rust 1.85+ (stable, source builds only) and a model
credential for the surface you use (CLAUDE_CODE_OAUTH_TOKEN,
ANTHROPIC_API_KEY, or OPENAI_API_KEY).
On AWS Bedrock, set [model] provider = "bedrock" and authenticate with
AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (+ optional AWS_SESSION_TOKEN)
and AWS_REGION: the agent and the auto-harvester both support it, and can be
pointed at different providers. That's it for chat; Docker, Harbor, and Python
are only needed for benchmark runs.
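If you want to check credentials before a first run, a preflight along these lines may help. It is a sketch built only from the variable names listed above; the function name and the idea of returning a list of missing names are assumptions, and Kimetsu itself may validate credentials differently.

```python
# Illustrative preflight; not part of Kimetsu.
DIRECT_KEYS = ("CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_API_KEY", "OPENAI_API_KEY")
BEDROCK_REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_credentials(env, provider=None):
    """Return the variable names still needed; an empty list means ready.

    For Bedrock every listed name is required. Otherwise the returned names
    mean "set any one of these", depending on the surface you use.
    """
    if provider == "bedrock":
        # AWS_SESSION_TOKEN is optional, so it is not checked.
        return [name for name in BEDROCK_REQUIRED if not env.get(name)]
    if any(env.get(name) for name in DIRECT_KEYS):
        return []
    return list(DIRECT_KEYS)


print(missing_credentials({"OPENAI_API_KEY": "sk-example"}))
print(missing_credentials({"AWS_ACCESS_KEY_ID": "a", "AWS_SECRET_ACCESS_KEY": "b"}, provider="bedrock"))
```

The first call prints an empty list; the second reports that AWS_REGION is still missing.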
Wiring host agents
Wire Kimetsu into any supported host. The built-in installers cover Claude Code, Codex, Pi, OpenClaw, and Cursor:
kimetsu plugin install claude --workspace . # writes .mcp.json + .claude/settings.json
kimetsu plugin install codex --workspace . # writes .codex/config.toml + .codex/hooks.json + skill + agent
kimetsu plugin install openclaw --workspace . # MCP server + hooks plugin + skill in .openclaw/ (requires --features openclaw on source builds)
kimetsu plugin install pi --workspace . # TS extension (Pi has no MCP) + skill in .pi/ (requires --features pi on source builds)
kimetsu plugin install cursor --workspace . # writes .cursor/mcp.json + .cursor/rules/kimetsu-brain/rule.md
# Install globally for every project (writes to the host's home config dir):
kimetsu plugin install claude --scope global
# See what's wired where, or remove just the wiring (keeps the binary + brain):
kimetsu plugin status
kimetsu plugin uninstall codex --yes
# Or do init + install + selftest in one shot:
kimetsu setup --host claude-code
# Switched editors? Move your wiring, no reinstall (prebuilt/npm binaries
# include every host; on a source build add `--features pi`):
kimetsu plugin uninstall claude-code --yes # drop the old host's wiring
kimetsu plugin install pi # wire the new one
Cursor: this host has no UserPromptSubmit-style hook system, so MCP + an
always-on guidance file are the complete integration surface (no automatic
prompt-time context injection; the model must call kimetsu_brain_context
manually). The config schema for Cursor (.cursor/mcp.json) matches the host's
current official MCP documentation (re-verified June 2026): Cursor uses
mcpServers with type: "stdio", command, and args. The installer merges
non-destructively, so any existing servers are preserved.
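A non-destructive merge into .cursor/mcp.json could look roughly like the sketch below. The mcpServers, type, command, and args keys come from the description above; the entry name "kimetsu", the args placeholder, and the function itself are assumptions for illustration, not the installer's real output.

```python
import copy
import json


def merge_mcp_server(existing, name, command, args):
    """Return a copy of `existing` with one stdio server set under mcpServers."""
    merged = copy.deepcopy(existing)  # never mutate the caller's config
    servers = merged.setdefault("mcpServers", {})
    servers[name] = {"type": "stdio", "command": command, "args": list(args)}
    return merged


# An existing config that already has another server wired.
current = {"mcpServers": {"other": {"type": "stdio", "command": "other-server", "args": []}}}

# The entry name and the args value are placeholders, not Kimetsu's real values.
updated = merge_mcp_server(current, "kimetsu", "kimetsu", ["<mcp-args>"])
print(json.dumps(updated, indent=2))
```

Only the one named entry is written, so the other server survives, and running the merge again yields the same document.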
--scope defaults to workspace. The installer merges into existing
config: if you already have hooks (even on the same events Kimetsu uses:
UserPromptSubmit, PreToolUse, …), your hooks are kept and Kimetsu's are
added alongside them. Re-running is idempotent and never needs --force.
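The merge behaviour described here (existing hooks kept, Kimetsu's added alongside, re-runs changing nothing) can be sketched as follows. The hook shape is deliberately simplified to a list of command strings per event, which is not the hosts' real schema, and the command names are placeholders.

```python
import copy


def merge_hooks(settings, kimetsu_hooks):
    """Add hook commands per event, keeping whatever is already configured."""
    merged = copy.deepcopy(settings)
    hooks = merged.setdefault("hooks", {})
    for event, commands in kimetsu_hooks.items():
        present = hooks.setdefault(event, [])
        for command in commands:
            if command not in present:  # skipped on re-run, so merging is idempotent
                present.append(command)
    return merged


# A user who already has a hook on the same event Kimetsu uses.
mine = {"hooks": {"UserPromptSubmit": ["my-linter"]}}
ours = {"UserPromptSubmit": ["<kimetsu-hook>"], "PreToolUse": ["<kimetsu-hook>"]}

once = merge_hooks(mine, ours)
twice = merge_hooks(once, ours)
print(once["hooks"]["UserPromptSubmit"])
print(once == twice)
```

The first print shows both the user's hook and the added one on UserPromptSubmit; the second prints True, which is the property that makes --force unnecessary.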
Now your host agent gets the kimetsu_* MCP tools (brain context, memory
add/list, citations, repo ingest, the cross-harness skill bridge) and starts
banking memories across every session.
Auto-harvest and the SessionEnd distiller
Memories get auto-harvested: when you fix a command that was failing,
or finish a non-trivial session without recording anything, a Kimetsu hook cues
the agent to dispatch a background kimetsu-memory-harvester subagent (a cheap
in-agent distiller) that records the lesson for next time, with no extra API key.
Turn it off with [learning] auto_harvest = false in .kimetsu/project.toml.
For a deterministic harvest that doesn't depend on the agent, kimetsu plugin install claude-code and kimetsu plugin install codex offer to set up a
SessionEnd distiller: a cheap configured model (Anthropic
claude-haiku-4-5, OpenAI gpt-5.4-mini, or a compatible endpoint via
ANTHROPIC_BASE_URL / OPENAI_BASE_URL) that distills each session itself at
the end and records the lessons. Claude Code runs it from SessionEnd; Codex
runs it from the supported Stop hook with --distill-on-stop. The wizard
stores the key in a gitignored .env; skip it with --no-setup. Run it with
--scope global to configure the distiller once in
~/.kimetsu/: it then distills every project's sessions into your user brain
(available everywhere), unless that project has its own distiller.
Retrieval levels
Kimetsu ships a single retrieval knob so you do not have to tune the embedder,
reranker, and query expansion by hand. Pick a level and the pipeline is
configured for you; override it later if you want. Swapping levels is a one-line
change in .kimetsu/project.toml:
[retrieval]
level = "deep" # basic | flexible | deep | advanced | custom

| Level | What it does | What it needs | When to use |
|---|---|---|---|
| basic | Lexical (FTS) retrieval only. No semantic embeddings, no reranking. | No model and no downloads. | Fastest setup, air-gapped or tiny machines, or when you only want keyword recall. |
| flexible | Semantic retrieval, no reranking. | The embedding model (downloaded on first run). | You want semantic recall but want to skip the rerank stage. |
| deep | Semantic retrieval plus a cross-encoder reranker. Recommended default. | The embedding model plus the reranker model. Works locally and remote. | The best balance of quality and cost. This is what new projects ship on. |
| advanced (Beta) | Everything in deep plus HyDE query expansion. | A capable cheap model. A small 3B local model will not help here: use OpenAI or Anthropic, or a larger Ollama model such as qwen2.5:14b. | Hard or sparse queries where a hypothetical answer improves recall. |
| custom | Uses your individual [embedder] settings for manual control. | Whatever you configure. | You want to tune embedder.enabled and embedder.reranker yourself. |
The level resolves into [embedder] at config-load time, so every retrieval
consumer sees the same resolved settings. New projects from kimetsu init ship
on deep. Existing project.toml files that have no [retrieval] section
default to custom, so they keep using their [embedder] values exactly as
before. For advanced, set a cheap model under [cheap_model]; without one,
HyDE falls back to the raw query and prints a one-line note.
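One way to picture "the level resolves into [embedder] at config-load time" is a lookup from level to a small settings record. The sketch below follows the table and the fallback rules in this section; the hyde field, the defaults used for custom, and the function name are assumptions, not Kimetsu's internal representation.

```python
# Illustrative presets derived from the levels table; field names beyond
# embedder.enabled and embedder.reranker are hypothetical.
PRESETS = {
    "basic": {"enabled": False, "reranker": False, "hyde": False},
    "flexible": {"enabled": True, "reranker": False, "hyde": False},
    "deep": {"enabled": True, "reranker": True, "hyde": False},
    "advanced": {"enabled": True, "reranker": True, "hyde": True},
}


def resolve_retrieval(config):
    """Resolve [retrieval].level into effective retrieval settings."""
    # A project.toml with no [retrieval] section behaves as "custom".
    level = config.get("retrieval", {}).get("level", "custom")
    if level == "custom":
        user = config.get("embedder", {})
        return {
            "enabled": user.get("enabled", False),    # assumed default
            "reranker": user.get("reranker", False),  # assumed default
            "hyde": False,
        }
    if level not in PRESETS:
        raise ValueError(f"unknown retrieval level: {level!r}")
    resolved = dict(PRESETS[level])
    if resolved["hyde"] and not config.get("cheap_model"):
        # Without a cheap model, HyDE falls back to the raw query.
        print("note: no [cheap_model] configured; using the raw query")
        resolved["hyde"] = False
    return resolved


print(resolve_retrieval({"retrieval": {"level": "deep"}}))
print(resolve_retrieval({"embedder": {"enabled": True, "reranker": False}}))
```

Resolving once at load time means every consumer reads the same record instead of re-interpreting the level on its own.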
Configuration toggles
Every optional feature can be turned off in .kimetsu/project.toml:
embeddings ([embedder] enabled), ambient workspace context
([broker] ambient), the global user brain ([kimetsu] use_user_brain),
auto-harvest, the distiller, secret redaction. The precedence is
env override > config > default, and kimetsu config edit opens the file
in $EDITOR and re-validates on save. Re-installing merges, so your toggles
survive.
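The precedence rule (env override > config > default) is easy to state and easy to get subtly wrong, so here is a minimal sketch of it for a boolean toggle. The environment variable name KIMETSU_EMBEDDER_ENABLED and the accepted true/false spellings are hypothetical; check the real names before relying on them.

```python
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def resolve_toggle(env, config, section, key, default, env_var):
    """Resolve one boolean toggle: env override, then config, then default."""
    raw = env.get(env_var)
    if raw is not None:
        value = raw.strip().lower()
        if value in _TRUE:
            return True
        if value in _FALSE:
            return False
        raise ValueError(f"{env_var} must be a boolean, got {raw!r}")
    section_values = config.get(section, {})
    if key in section_values:
        return section_values[key]
    return default


config = {"embedder": {"enabled": False}}
name = "KIMETSU_EMBEDDER_ENABLED"  # hypothetical variable name

print(resolve_toggle({}, config, "embedder", "enabled", True, name))
print(resolve_toggle({name: "true"}, config, "embedder", "enabled", True, name))
print(resolve_toggle({}, {}, "embedder", "enabled", True, name))
```

The three calls print False (config wins over the default), True (env wins over config), and True (the default applies when nothing is set).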
Maintenance & lifecycle
kimetsu config set embedder.enabled false # flip any toggle (config get reads one)
kimetsu brain export mem.json # move memories between brains (import reads them)
kimetsu brain memory edit <id> --text "…" # fix a recording in place (undo retires the last one)
kimetsu runs prune --older-than 30d # drop old run dirs; brain compact VACUUMs brain.db
kimetsu ps # see running MCP servers; stop clears a stale one
kimetsu uninstall # tiered: binary / + plugin wiring / + brains
.kimetsu/ stays lean: just brain.db + project.toml; transient
proactive/chat/bench output lives under ~/.kimetsu/cache/.
For retrieval model selection (embedder/reranker swapping, benchmarking your own corpus), see Retrieval models & benchmarking in HOW-KIMETSU-WORKS.