Updated 20 June 2026

What does context cost your coding agents? Measured, by scenario.

Direct answer

In trovex's savings model, 40 to 65% of the tokens an agent spends on a doc lookup go to rereading candidate files to work out which one is canonical. The common case, a few small-to-mid docs, lands near 60%. The number tracks repo shape (how many docs the agent would read and how big they are), not which agent or client you use.

The measured cost, by scenario

Without trovex, an agent globs and greps, then reads the top candidate .md files to triage which is current. With trovex it reads one canonical doc plus an ~80-token pointer. The saving is the difference. Each row below is computed from trovex's open-source model (saved = would-have-read − one canonical doc − 80; assumptions shown), not a marketing estimate.

Token cost of a single doc lookup, by repo shape. All figures are tokens. Per-doc sizes are typical for markdown files; "With trovex" is one canonical doc plus trovex's ~80-token pointer.
Scenario (candidates × tokens per doc)	Without trovex	With trovex	Tokens saved per lookup (share)
3 small docs (~400 ea)	1,200	480	720 (60%)
2 small docs (~400 ea)	800	480	320 (40%)
3 mid docs (~900 ea)	2,700	980	1,720 (64%)
2 mid docs (~900 ea)	1,800	980	820 (46%)
3 large docs (~1,800 ea)	5,400	1,880	3,520 (65%)
2 large docs (~1,800 ea)	3,600	1,880	1,720 (48%)

Two honest reads of this table. First, the saving is real and large whenever the agent would have opened more than one candidate, which is the normal case on a repo with overlapping docs. Second, it is a range, not a single magic number: a repo where agents only ever read two small files saves less than one where they sift three big ones. The ~60% trovex quotes is the three-small-docs row, a point inside that range, not the ceiling.
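
The table rows can be reproduced in a few lines of Python. This is a minimal sketch of the formula quoted above, not trovex's savings.py itself: the names lookup_cost and LookupCost are made up for illustration, and it assumes every candidate doc is the same size, as the table does.

```python
from dataclasses import dataclass

POINTER_TOKENS = 80  # the ~80-token pointer response quoted in the text


@dataclass(frozen=True)
class LookupCost:
    without: int       # tokens read when the agent triages every candidate
    with_pointer: int  # one canonical doc plus the pointer response
    saved: int         # without - with_pointer
    ratio: float       # saved / without


def lookup_cost(candidates: int, tokens_per_doc: int,
                pointer_tokens: int = POINTER_TOKENS) -> LookupCost:
    """Apply saved = would-have-read - one canonical doc - pointer."""
    without = candidates * tokens_per_doc
    with_pointer = tokens_per_doc + pointer_tokens
    saved = without - with_pointer
    return LookupCost(without, with_pointer, saved, saved / without)


if __name__ == "__main__":
    # Same six scenarios as the table: (candidates, tokens per doc).
    scenarios = [(3, 400), (2, 400), (3, 900), (2, 900), (3, 1800), (2, 1800)]
    for n, size in scenarios:
        c = lookup_cost(n, size)
        print(f"{n} x {size}: {c.without:,} -> {c.with_pointer:,}, "
              f"saved {c.saved:,} ({c.ratio:.0%})")
```

Swapping in your own candidate count and doc size gives a rough estimate before you index anything; the figure trovex reports on a real repo is the measured one.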

Why does an agent reread files at all?

Because nothing tells it which doc is authoritative. When several files mention the same topic, the agent has no signal for which is current, so it opens them and reasons over the pile, every session, for every agent. A 2026 ETH Zurich study (Evaluating AGENTS.md, Gloaguen et al.) found the obvious fix backfires: adding repository context files reduced task success and raised inference cost by over 20% versus no repo context. More context, dumped in, makes the agent read more and wander more. The win is the right slice per task, not a longer file.

Doesn't a bigger context window fix it?

No. A bigger window lets the agent read more files; it does not stop the rereading. You still pay per token for every candidate it opens to guess which doc is current, and a longer prompt can bury the few lines that matter (the "lost in the middle" effect). The lever is cutting the rereads, not enlarging the window. AI crawlers and agents alike are token-bound; the cost is in what they read, not what they could hold.

How trovex cuts it

trovex is an open-source, local-first MCP server that serves agents one canonical answer per query: a path:line pointer, precise down to the section, with a freshness marker (canonical, stale, or duplicate), instead of a candidate list to sift. The same answer goes to any MCP client, so the saving above holds whether you run Claude Code, Cursor, Windsurf, Cline, or Zed. It runs on your machine (vectors in SQLite, ONNX embeddings, no cloud or API keys) and reports the tokens it saved, so the number on your repo is measured, not assumed.

$ uv run trovex index /path/to/repo
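
To make the "pointer plus freshness marker" idea concrete, here is one way a client might model such an answer. This is a hypothetical shape for illustration only, not trovex's actual response schema: the Pointer and Freshness types, the parse_pointer helper, and the example path are all assumptions. Only the path:line form and the three marker values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Freshness(Enum):
    # The three marker values named in the text.
    CANONICAL = "canonical"
    STALE = "stale"
    DUPLICATE = "duplicate"


@dataclass(frozen=True)
class Pointer:
    path: str             # file the agent should open
    line: int             # line where the relevant section starts
    freshness: Freshness  # whether that doc is the current one


def parse_pointer(location: str, freshness: str) -> Pointer:
    """Split a 'path:line' string on its last colon and validate the marker."""
    path, sep, line = location.rpartition(":")
    if not sep or not path or not line.isdigit():
        raise ValueError(f"expected 'path:line', got {location!r}")
    # Freshness(...) raises ValueError for an unknown marker.
    return Pointer(path, int(line), Freshness(freshness))


if __name__ == "__main__":
    # Hypothetical example values.
    print(parse_pointer("docs/auth.md:42", "canonical"))
```

An answer this small is why the pointer costs tens of tokens while a candidate list costs the full text of every file on it.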

FAQ

How much does context cost a coding agent?

In trovex's model, 40 to 65% of the tokens spent on a doc lookup go to rereading candidate files to work out which one is canonical. The common case (a few small-to-mid docs) lands near 60%. The exact figure depends on how many files the agent would otherwise read and how big they are.

Does the saving depend on which coding agent or client I use?

No. trovex serves the same one canonical answer per query whether the client is Claude Code, Cursor, Windsurf, Cline, or another MCP client, so the token saving is client-agnostic. What changes the number is the repo shape (how many candidate docs exist and how large they are), not the agent.

Does a bigger context window make this cheaper?

No. A bigger window lets the agent read more files; it does not stop it rereading them. You still pay per token for every candidate it opens to guess which doc is current. Cutting the rereads, not enlarging the window, is what cuts the cost.

How is the ~60% number measured?

By formula: saved = (tokens the agent would have read across the top candidate files) − (the one canonical doc trovex returns) − (the ~80-token pointer response), and ratio = saved / would-have-read. The table on this page shows the computation for each scenario; the model is trovex's open-source savings.py.
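
Written out with symbols, the same formula shows where the range comes from. The notation is an assumption made here for illustration: n is the number of candidate docs, s is the tokens per doc (taken as equal across candidates, as in the table), and p is the pointer size, 80.

```latex
\begin{aligned}
\text{saved} &= n s - s - p = (n - 1)\,s - p \\
\text{ratio} &= \frac{\text{saved}}{n s} = 1 - \frac{1}{n} - \frac{p}{n s} \\[4pt]
n = 3,\ s = 400,\ p = 80:\quad \text{ratio} &= 1 - \tfrac{1}{3} - \tfrac{80}{1200} = 0.60
\end{aligned}
```

As s grows the last term shrinks and the ratio approaches 1 − 1/n: 50% for two candidates, about 67% for three. That is why the two-doc rows in the table stay under 50% and the three-doc rows top out near 65%.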

Measure it on your own repo.

Point trovex at your docs, serve your agents one current answer per query, and read the tokens it saves you.

Open source. No cloud, no API keys. Your docs never leave your machine. Running agents in production at scale? tsukumo (the team behind trovex) helps teams operate them.