Is AI agent memory any good?
Not in real group settings. A 2026 benchmark (GroupMemBench, arXiv:2605.14498) found the strongest memory system scored 46.0% average accuracy (27.1% on knowledge updates, 37.7% on ambiguous terms), and a plain BM25 keyword search matched or beat most of the systems tested. Memory is a retrieval-quality problem, not something you solve by bolting on a vector database.
Memory is a retrieval problem, not a storage one
The pitch is a vector database that remembers everything. In a real group setting that is not what you get. A 2026 benchmark (GroupMemBench) put memory systems into multi-user chats and the strongest scored 46.0% average accuracy, dropping to 27.1% on knowledge updates and 37.7% on ambiguous terms. A plain BM25 keyword search matched or beat most of them. What fails is retrieval precision (returning the one right item), not storage capacity. Put your effort into returning the correct context, not into a fancier place to keep it. The numbers are the paper's, on its benchmark.
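For readers who have not seen BM25, here is a minimal sketch of the kind of keyword baseline the paragraph above refers to. It is not the benchmark's implementation: the whitespace tokenizer, the sample messages, the parameter values (k1=1.5, b=0.75, common defaults) and the Lucene-style IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems
    # usually strip punctuation and may stem).
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Fall back to 1.0 so empty documents do not cause a division by zero.
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # IDF variant with +1 inside the log, which keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical group-chat messages.
messages = [
    "alice: the launch date is march 3",
    "bob: lunch is at noon today",
    "carol: the budget review moved to friday",
]
ranking = bm25_rank("launch date", messages)
best_index, best_score = ranking[0]
print(messages[best_index])
```

Nothing here needs an embedding model or a vector index, which is part of why a baseline like this is a useful sanity check before investing in a heavier memory stack.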
FAQ
Is AI agent memory reliable?
Not in realistic multi-user settings. A 2026 benchmark found the best memory system scored 46.0% average accuracy, falling to 27.1% on knowledge updates. A 1:1 demo hides this; a real group chat exposes it.
Does a vector database give agents good memory?
Not on its own. In the benchmark a plain BM25 keyword search matched or beat most vector-based memory stacks. Storing more is not the problem; returning the one correct item is. Memory is a retrieval-quality problem.
Why does agent memory fail in group settings?
Multiple users, updated facts, and ambiguous terms break naive recall. The benchmark measured 27.1% on knowledge updates and 37.7% on ambiguous terms: the system returns stale or plausible-but-wrong context instead of the current right answer.
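The stale-answer failure can be made concrete with a small sketch. It assumes each remembered statement has already been reduced to a keyed fact with a timestamp; the `Fact` type, its field names, and the sample log are hypothetical, and deciding which messages refer to the same key is exactly the hard part that ambiguous terms make worse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    key: str        # what the fact is about (hypothetical, e.g. "launch_date")
    value: str
    author: str
    timestamp: int  # any monotonically increasing clock


def current_facts(facts):
    """Keep only the most recent fact for each key."""
    latest = {}
    for fact in facts:
        seen = latest.get(fact.key)
        if seen is None or fact.timestamp > seen.timestamp:
            latest[fact.key] = fact
    return latest


# Hypothetical group chat: bob later corrects alice's launch date.
log = [
    Fact("launch_date", "March 3", "alice", 1),
    Fact("budget_owner", "carol", "bob", 2),
    Fact("launch_date", "March 10", "bob", 3),
]
resolved = current_facts(log)
print(resolved["launch_date"].value)
```

A system that ranks only by similarity has no reason to prefer the March 10 statement over the March 3 one, since both match the query equally well; some explicit notion of recency or supersession has to come from somewhere.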
What should I invest in for agent memory?
Retrieval precision, not storage sophistication. A search that returns the right item beats a vector stack that returns plausible-but-wrong context. Treat it as a retrieval problem: rank for the correct, current answer and verify what comes back.
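One way to verify what comes back is to score a retriever on a small set of queries with known correct answers. The sketch below computes precision at rank 1 under stated assumptions: `retrieve` is any function that maps a query to a single document ID, and the labeled queries and the toy retriever are made up for illustration.

```python
def precision_at_1(retrieve, labeled_queries):
    """Fraction of queries whose top result is the expected document ID."""
    if not labeled_queries:
        raise ValueError("need at least one labeled query")
    hits = sum(
        1 for query, expected_id in labeled_queries
        if retrieve(query) == expected_id
    )
    return hits / len(labeled_queries)


# Hypothetical labeled set: (query, ID of the one correct document).
labeled = [
    ("launch date", "doc-2"),
    ("budget owner", "doc-5"),
    ("lunch time", "doc-7"),
]

# Toy stand-in for a real retriever; it gets "budget owner" wrong.
toy_index = {
    "launch date": "doc-2",
    "budget owner": "doc-4",
    "lunch time": "doc-7",
}


def toy_retrieve(query):
    return toy_index.get(query)


score = precision_at_1(toy_retrieve, labeled)
print(f"precision@1 = {score:.2f}")
```

Running the same labeled set against a keyword baseline and a vector stack gives a like-for-like comparison on your own data rather than on a demo.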
Give your agents context you can trust.
trovex serves your agents one canonical doc per query, current and inspectable, so the context they read is the right one, not a plausible guess from a vector store. Open source, local, about a minute to set up.
uv tool install trovex
Open source. No cloud, no API keys. Your docs never leave your machine.