Keywords: LLM, Agent, Memory
Abstract: Long-horizon LLM agents must carry state across many tool interactions, yet naïve context extension via periodic
summarization or top-$k$ retrieval can discard decisive evidence and make failures hard to audit.
We introduce \textbf{HELM} (\textbf{H}ierarchical \textbf{E}pistemic \textbf{L}earned \textbf{M}emory), a framework that
exposes memory as an explicit, event-driven interface and couples memory access with \emph{epistemic governance}.
HELM instantiates a three-tier nested store, \textbf{SHNM}, that links episodic traces to consolidated recalls and
thematic indices via provenance edges and epistemic metadata (timestamps, source types, tool status).
Governance makes memory operations reproducible: retrieval is re-ranked with recency- and status-aware scoring, conflict
resolution prefers verified, newer evidence, and provenance expansion can trace any recall back to concrete tool
spans.
On top of SHNM, a learned controller decides when to read, write, consolidate, and prune under task and efficiency
budgets, and a tool-aware embedding model indexes tool-augmented trajectories to improve retrieval of procedural and
trace-based memories.
We evaluate on five long-horizon benchmarks and report diagnostics that jointly measure end-task performance, memory
efficiency, and epistemic reliability, including auditable recall metrics that quantify provenance faithfulness.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Agent, LLM, Memory
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low compute settings-efficiency
Languages Studied: English
Submission Number: 9591