
How do I reduce the token cost of giving a coding agent context?

Direct answer

Cut the repeated retrieval, not the one-time prompt. Most of a coding agent's context cost comes from rereading the same docs again and again to work out which one is current. Route each query to the single canonical doc that answers it, read at the section level, and skip stale and duplicate copies. Together, those steps cut about 60% of the tokens spent per doc lookup.

Where does the cost actually go?

The cost goes not into the one big prompt people worry about but into the steady drip of lookups. Every session, an agent reopens the same files to re-establish context: which runbook is current, where the deploy steps live, what a past incident decided. Multiply that by every agent and every teammate, and the retrieval bill dwarfs the occasional large prompt. The fix is to make each lookup cheap and final, so it is paid for once.
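To see why the drip outweighs the big prompt, here is a back-of-the-envelope sketch. Every number in it (session count, lookups per session, files reread, tokens per file, prompt size) is a made-up assumption for illustration, not a measurement, and `retrieval_tokens` is a hypothetical helper.

```python
def retrieval_tokens(sessions: int, lookups_per_session: int,
                     files_per_lookup: int, tokens_per_file: int) -> int:
    """Total tokens spent rereading files across all sessions."""
    return sessions * lookups_per_session * files_per_lookup * tokens_per_file


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
one_time_prompt = 20_000          # one large context prompt
repeated = retrieval_tokens(
    sessions=50,                  # agent sessions across the team
    lookups_per_session=4,        # "which runbook is current?" style questions
    files_per_lookup=3,           # candidate files reopened per question
    tokens_per_file=1_500,        # whole-file read
)

print(repeated)                    # 900000
print(repeated / one_time_prompt)  # 45.0
```

Under these assumptions the repeated lookups cost 45 times the large prompt, which is why the lookup is the thing to make cheap.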

The four levers that move the bill

What drives agent context cost, and the lever that cuts it
Cost driver | Lever
Rereading files to find the current one | Route each query to one canonical doc
Loading whole files for a small answer | Section-level reads
Paying for stale / duplicate copies | Freshness markers skip them
Re-deriving what another agent already found | Write-back so the fleet shares one answer
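The first three levers can be sketched in a few lines. This is a toy model under stated assumptions, not how trovex is implemented: `Doc`, `naive_lookup`, `routed_lookup`, the file paths, and the word-count token proxy are all hypothetical, and the sample docs are invented.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    path: str
    sections: dict[str, str]            # heading -> section body
    stale: bool = False                 # freshness marker (assumed shape)
    duplicate_of: Optional[str] = None  # path of the canonical copy, if any


def tokens(text: str) -> int:
    """Crude token proxy: count whitespace-separated words."""
    return len(text.split())


def naive_lookup(docs: list[Doc], topic: str) -> int:
    """Read every file that has the topic, in full. Returns tokens spent."""
    return sum(tokens(" ".join(d.sections.values()))
               for d in docs if topic in d.sections)


def routed_lookup(docs: list[Doc], topic: str) -> Optional[tuple[str, str, int]]:
    """Skip stale and duplicate docs, then read only the matching section."""
    for d in docs:
        if d.stale or d.duplicate_of is not None:
            continue
        if topic in d.sections:
            body = d.sections[topic]
            return d.path, body, tokens(body)
    return None


# Invented sample corpus: one stale copy, one canonical doc, one duplicate.
docs = [
    Doc("runbooks/deploy-old.md",
        {"deploy": "old steps " * 50, "rollback": "old rollback " * 50},
        stale=True),
    Doc("runbooks/deploy.md",
        {"deploy": "current steps " * 50, "rollback": "current rollback " * 50}),
    Doc("wiki/deploy-copy.md",
        {"deploy": "current steps " * 50, "rollback": "current rollback " * 50},
        duplicate_of="runbooks/deploy.md"),
]

naive_cost = naive_lookup(docs, "deploy")
path, body, routed_cost = routed_lookup(docs, "deploy")
print(path, naive_cost, routed_cost)
```

The gap between the two costs in this toy corpus depends entirely on the invented sample, so treat it as a picture of the mechanism rather than a benchmark.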

How trovex does it

trovex is an open-source, local-first MCP server built around these levers: one canonical answer per query, section-level reads, freshness markers, and a shared write path. It runs on your machine (SQLite + ONNX, no cloud or keys) and keeps a savings dashboard so the reduction is a number you can see, not a claim you have to trust.
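The shared write path is the least familiar of the four levers, so here is a minimal sketch of the idea using Python's built-in `sqlite3` module. The table name, columns, and the functions `open_store`, `write_back`, and `lookup` are assumptions made up for this example; they are not trovex's actual schema or API.

```python
import sqlite3
from typing import Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a local answer store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS answers (
               query       TEXT PRIMARY KEY,
               doc_path    TEXT NOT NULL,
               section     TEXT NOT NULL,
               verified_at TEXT NOT NULL
           )"""
    )
    return conn


def write_back(conn: sqlite3.Connection, query: str, doc_path: str,
               section: str, verified_at: str) -> None:
    """Record which doc and section answered a query, replacing any older entry."""
    conn.execute(
        "INSERT OR REPLACE INTO answers (query, doc_path, section, verified_at) "
        "VALUES (?, ?, ?, ?)",
        (query, doc_path, section, verified_at),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, query: str) -> Optional[Tuple[str, str, str]]:
    """Return (doc_path, section, verified_at) if some agent already answered this."""
    return conn.execute(
        "SELECT doc_path, section, verified_at FROM answers WHERE query = ?",
        (query,),
    ).fetchone()


# One agent resolves the question once (values are illustrative only)...
conn = open_store()
write_back(conn, "how do we deploy?", "runbooks/deploy.md", "deploy",
           "2024-01-01T00:00:00Z")

# ...and the next agent reuses the answer instead of re-deriving it.
hit = lookup(conn, "how do we deploy?")
print(hit)
```

Keying on the query keeps one answer per question, so a later write overwrites the earlier one instead of accumulating duplicates.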

See the tokens you'd stop spending.

Index your repo and let trovex serve one current answer per query.

Open source. No cloud, no API keys. Your docs never leave your machine.