Updated
Cut your team's AI agent token bill at scale
At team scale the bill is mostly the same waste multiplied: every agent, on every session, rereads candidate docs to re-find one answer. Cut the rereading once, centrally, by serving the single canonical doc per query instead of the pile. trovex models about 60% fewer tokens per doc lookup (measured at equal task-success, median 69% across 26 queries), and the win scales with team size.
The bill is rereading, multiplied by your headcount
One developer's agent rereading the same docs is a rounding error. A whole team doing it, every session, is the bill. The tokens go to re-finding which of several near-duplicate docs is current, not to the actual work. Adding seats scales the waste linearly. The fix is to cut the rereading at the source, once, for everyone: resolve a question to one canonical doc with a freshness marker instead of dumping the top candidates into every agent's context.
Cut it once, centrally, not per seat
trovex indexes your markdown and serves the one doc that answers a query, locally, over MCP, so every client on the team (Claude Code, Cursor, Windsurf, Cline) reads the same canonical answer instead of each rebuilding it. On our own repo that measured a median of 69% fewer tokens per lookup at equal task-success across 26 queries, range 41 to 81%; we publish about 60% as the conservative floor. Run uvx trovex bench to get your team's own number.
Where trovex fits, and where the consulting does
trovex is the shared-context layer and you can install it today. If the token bill is a symptom of a deeper operating problem, how the team runs agents in production, that is what tsukumo consults on. trovex is the tool; the operating model is the engagement. Here is what getting help running agent fleets in production looks like.
FAQ
Where does a team's agent token spend actually go?
Mostly to rereading. Every agent, on every session, reopens the top candidate docs to work out which one is current, then throws most of them away. Multiply that across a team and the bill is dominated by re-finding answers the fleet already had, not by generating new work.
Is this a per-seat fix or a central one?
Central. Buying more seats multiplies the same waste. The lever is to serve one canonical doc per query from a shared store every agent reads, so the rereading is cut once for the whole team instead of paid per developer.
How much does it save, and how do we measure it on our repos?
On our own repo it measured a median of 69% fewer tokens per lookup at equal task-success across 26 pre-registered queries, range 41 to 81%. We headline a conservative about 60%. Run the benchmark on your repo at trovex.dev/measure for your own number.
Won't a bigger context window cut the cost instead?
No. A bigger window lets each agent hold more, but it still rereads candidates to find the current doc and still pays for them. What cuts the cost is serving one canonical doc per query, not buying more context to sift.
Measure it on your own repo.
Index a folder of markdown, serve your agents one current answer per query, and see the tokens saved.
uv tool install trovex
Open source. No cloud, no API keys. Your docs never leave your machine.