Cut LLM tokens by up to 94%.

One install closes four fronts of token waste: what the model writes, what your shell dumps in, what it reads, and the catalogs & memory loaded every session. Without losing a single identifier.

Speak less. Ship more. — Spartan replies for AI agents.

latest v0.16.2 · 2026-06-02 · release notes →

$ npm install -g lakonai && lakonai install

github → · npm →

The story behind the name

Where the word laconic came from — and why your agent shouldn't open with “Great question!”.

In 346 BC, Philip II of Macedon — father of Alexander the Great — sent the Spartans a message:

“If I invade Lakonía, I will raze your cities to the ground.”

The Spartans replied with a single word:

“If.”

That region was Lakonía. Its people gave the English language the word laconic — using as few words as possible. They didn't waste breath, didn't waste arrows, didn't waste anything.

Your AI coding agent does. It opens with “Sure! I'd be happy to help…”, repeats your question back, and explains what the diff already shows. Every wasted token is a soldier you didn't need to send.

lakonai trims both sides.

What lakonai does

Most tools stop at one front. lakonai works all four, transparently, and improves the more you use it: it auto-learns new heavy commands and turns on a safe filter for them, with no config.

output

Terse the model

Installs a Spartan response rule. No preamble, no restating, no recap. Fragments are fine. Identifiers, paths, and errors stay verbatim.

input

Filter the shell

Compresses 30 commands (git/ls/grep/find/test runners/lint/docker) before they hit context, via a Claude hook or, on every other agent, the lakonai shim.

reads

Block junk reads

Denies 30+ junk paths (node_modules/, lockfiles, build artifacts) and caps huge files — on both the Claude Read tool and shell cat/head/tail. Grep capped at head_limit=30.
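As a rough illustration of the deny-and-cap idea, here is a minimal Python sketch. The directory names, filename patterns, size threshold, and the `read_decision` function are all assumptions made for this example; they are not lakonai's actual rule list or API.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical deny rules for illustration; not lakonai's actual list.
DENIED_DIRS = {"node_modules", "dist", "build", ".git"}
DENIED_NAMES = ["*.lock", "package-lock.json", "pnpm-lock.yaml", "*.min.js"]
MAX_BYTES = 256 * 1024  # assumed size cap


def read_decision(path: str, size_bytes: int) -> str:
    """Return 'deny', 'cap', or 'allow' for a requested file read."""
    p = PurePosixPath(path)
    # Deny anything that is, or sits under, a junk directory.
    if any(part in DENIED_DIRS for part in p.parts):
        return "deny"
    # Deny lockfiles and other generated files by name.
    if any(fnmatch(p.name, pattern) for pattern in DENIED_NAMES):
        return "deny"
    # Large files are allowed only in truncated form.
    if size_bytes > MAX_BYTES:
        return "cap"
    return "allow"
```

Checking the path before the size means a lockfile is refused outright regardless of how small it is, while an ordinary source file is only ever truncated.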

context

Shrink the catalogs

Auto-wraps each MCP server so its tool/prompt/resource descriptions are compressed before they reach context. Offline, reversible, opt-out LAKON_NO_MCP=1.

memory

Compress your CLAUDE.md

lakonai compress-memory <file> shrinks authored memory via a local AI CLI you already have. No API key. Backed up + validated byte-for-byte; steer it with a freeform instruction.

universal

Every agent, one shim

lakonai shim routes shell commands through lakonai at the PATH level — no hook API, no model cooperation. Shell filtering becomes automatic on Codex, Cursor, Windsurf, Cline, Gemini too.
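To make "at the PATH level" concrete: a shim is typically an executable with the same name as the real command, placed in a directory that is searched first. The Python sketch below shows only that lookup order on a POSIX system. The temporary directory, the wrapper script body, and the function names are hypothetical and do not reflect how lakonai lays out its shims.

```python
import os
import shutil
import stat
import tempfile
from typing import Optional


def resolve_with_shim(command: str, shim_dir: str, base_path: str) -> Optional[str]:
    """Resolve `command` with shim_dir searched before the rest of PATH."""
    search_path = shim_dir + os.pathsep + base_path
    return shutil.which(command, path=search_path)


def demo() -> bool:
    """Create a fake `git` wrapper and confirm it wins the PATH lookup."""
    with tempfile.TemporaryDirectory() as shim_dir:
        shim = os.path.join(shim_dir, "git")
        with open(shim, "w") as f:
            # Hypothetical wrapper body: a real shim would filter the output.
            f.write('#!/bin/sh\nexec /usr/bin/git "$@"\n')
        # Mark the wrapper executable so PATH lookup will accept it.
        os.chmod(shim, os.stat(shim).st_mode | stat.S_IXUSR)
        found = resolve_with_shim("git", shim_dir, os.environ.get("PATH", ""))
        return found is not None and os.path.dirname(found) == shim_dir
```

Because the agent just runs `git`, nothing on the agent side has to change; whichever executable comes first on PATH is the one that runs.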

learn

Auto-learn new commands

A heavy, frequent command gets promoted to the filtered set automatically — learned from ~/.lakon/log.jsonl on any agent. Throttled hourly; opt-out LAKON_NO_LEARN=1.
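One way such a promotion rule could work is sketched below in Python. The log schema (a `cmd` string and a `tokens` count per JSONL line), the thresholds, and the `commands_to_promote` function are assumptions for illustration only; the page does not document the actual format of ~/.lakon/log.jsonl or the real promotion criteria.

```python
import json
from collections import defaultdict


def commands_to_promote(log_lines, min_runs=5, min_avg_tokens=2000):
    """Return program names that ran often and produced heavy output.

    Assumes each log line is JSON like {"cmd": "cargo build", "tokens": 3000}.
    """
    runs = defaultdict(int)
    tokens = defaultdict(int)
    for raw in log_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip corrupt lines instead of failing the whole pass
        if not isinstance(entry, dict):
            continue
        parts = str(entry.get("cmd", "")).split()
        if not parts:
            continue
        name = parts[0]  # group by program name, ignoring arguments
        runs[name] += 1
        tokens[name] += int(entry.get("tokens", 0))
    return sorted(
        name
        for name in runs
        if runs[name] >= min_runs and tokens[name] / runs[name] >= min_avg_tokens
    )
```

Requiring both a run count and an average size keeps one-off huge commands and frequent tiny ones out of the filtered set.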

measure

See both sides

lakonai gain shows input savings (measured, deterministic) and output savings (estimated by your local AI CLI, no key). lakonai upgrade keeps you current.

Real savings

Conservative numbers from real commands — peaks go higher in practice. Run lakonai inspect <cmd> on your own machine to measure.

command                  tokens (raw → filtered)    delta
git log -p -10           10,497 → 78                -94%
ls -laR (deep dir)       23,624 → 117               -94%
npm test (passing)       4,451 → 358                -92%
git diff HEAD~5          13,230 → 798               -89%
Read pnpm-lock.yaml      ~56,000 → blocked          -95%
Grep (head_limit auto)   unbounded → 30 matches     capped
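If you measure your own commands, a relative reduction can be computed from the raw and filtered token counts. The Python sketch below assumes the delta is a plain relative reduction, which the page does not state; the published figures are described as conservative, so they need not equal this calculation. The `reduction_pct` helper is hypothetical.

```python
def reduction_pct(raw_tokens: int, filtered_tokens: int) -> float:
    """Percentage of tokens removed, relative to the raw count."""
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    return 100.0 * (raw_tokens - filtered_tokens) / raw_tokens


# The npm test row: 4,451 raw tokens filtered down to 358.
print(round(reduction_pct(4451, 358)))  # 92
```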

Use the filter directly

The CLI works standalone too. Prefix any command with lakonai and read only what matters.

lakonai git status        # compressed status
lakonai git log -50       # one line per commit
lakonai git diff          # only +/- lines
lakonai ls -la            # size + name only
lakonai grep -r foo src/  # truncates at 30 matches
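The comments above describe two simple transformations: keeping only +/- lines from a diff, and truncating a match list at 30. A minimal Python sketch of both follows. It is a naive illustration under assumed behavior, not lakonai's filter code, and the function names are hypothetical.

```python
def changed_lines(diff_text: str) -> list[str]:
    """Keep only added/removed lines from a unified diff.

    Naive: a removed line whose content starts with "--" looks like a
    file header and is dropped; a real parser would track hunk boundaries.
    """
    kept = []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content changes
        if line.startswith(("+", "-")):
            kept.append(line)
    return kept


def cap_matches(lines: list[str], limit: int = 30) -> list[str]:
    """Return at most `limit` lines plus a note on how many were cut."""
    if len(lines) <= limit:
        return list(lines)
    return lines[:limit] + [f"... {len(lines) - limit} more matches omitted"]
```

Appending a count of what was dropped lets the reader, or the agent, know that truncation happened and rerun with a narrower pattern if needed.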

Works with every major agent

One install auto-detects what you have and configures the global agents (Claude Code, Codex, Gemini) in ~/ — never touches your current directory. For per-project rules (Cursor, Windsurf, Cline), add --here from inside the repo.

Claude Code support covers every frontend — terminal, VS Code, JetBrains, desktop — they share the same ~/.claude/ config.

Claude Code (CLI + IDE) · Codex · Cursor · Windsurf · Cline · Gemini CLI