Hierarchical Episodic Memory in LLMs via Multi-Scale Event Organization

Published: 05 Mar 2025 · Last Modified: 20 Apr 2025 · NFAM 2025 Poster · License: CC BY 4.0
Track: long paper (up to 5 pages)
Keywords: large language models, long context, retrieval, episodic memory, event cognition, training-free, hierarchical memory
Abstract: A major limitation of contemporary large language models (LLMs) is their significant performance degradation when processing long contexts, primarily due to self-attention dilution and context window limitations. Recent work on retrieval-augmented LLMs has shown that integrating the formation and retrieval of human-inspired episodic memory (a form of associative memory) into Transformers, via an architecture termed EM-LLM, enables pre-trained models to process up to 10M tokens while consistently outperforming their full-context counterparts using only a fraction of the computational resources. A crucial feature of EM-LLM is the segmentation of the model's KV-cache into human-like events based on token-level surprise. However, this approach overlooks the hierarchical nature of human episodic memory, which exhibits nested timescale organization across multiple levels of abstraction. Here, we introduce two novel head-level event segmentation methods that leverage the inherent hierarchical processing in Transformer layers, combining similarity-based boundary detection with coordinated event hierarchies. Our experiments suggest that these structures not only improve retrieval performance but also exhibit patterns consistent with the nested event hierarchies observed in human cognition, providing both practical advances in LLM capabilities and insights into memory organization across artificial and biological systems.
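To make the two ingredients named in the abstract concrete, the sketch below illustrates (a) surprise-based event segmentation of a token stream and (b) a similarity-based refinement of the resulting boundaries over per-head key vectors. This is a minimal illustrative sketch, not the EM-LLM implementation: the function names, the running-statistics threshold, and the refinement rule are assumptions made for exposition only.

```python
# Minimal sketch (assumptions, not the authors' implementation):
# (a) mark event boundaries where token-level surprise is anomalously high,
# (b) refine each boundary to the nearby point of lowest key-vector similarity,
#     so events break where representations change most.

import numpy as np

def surprise_boundaries(token_logprobs, gamma=1.0):
    """Place a boundary where surprise (-log p) exceeds the running mean
    plus gamma standard deviations of the preceding tokens (illustrative rule)."""
    surprise = -np.asarray(token_logprobs, dtype=float)
    boundaries = [0]
    for t in range(1, len(surprise)):
        window = surprise[:t]
        if surprise[t] > window.mean() + gamma * window.std():
            boundaries.append(t)
    return boundaries

def refine_by_similarity(keys, boundaries, radius=2):
    """Shift each boundary within +/- radius positions to where the cosine
    similarity between adjacent key vectors is lowest (illustrative rule)."""
    keys = np.asarray(keys, dtype=float)
    keys = keys / (np.linalg.norm(keys, axis=1, keepdims=True) + 1e-8)
    refined = [boundaries[0]]
    for b in boundaries[1:]:
        lo, hi = max(1, b - radius), min(len(keys) - 1, b + radius)
        # similarity between consecutive keys; lower = sharper representational change
        sims = [keys[i - 1] @ keys[i] for i in range(lo, hi + 1)]
        refined.append(lo + int(np.argmin(sims)))
    return refined

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logprobs = rng.normal(-2.0, 0.5, size=64)
    logprobs[[20, 45]] -= 3.0          # two artificially surprising tokens
    keys = rng.normal(size=(64, 16))   # stand-in for one attention head's key vectors
    b = surprise_boundaries(logprobs)
    print("surprise boundaries:", b)
    print("refined boundaries: ", refine_by_similarity(keys, b))
```

In the hierarchical, head-level setting described in the abstract, one would presumably run such segmentation per head and then coordinate the resulting boundaries across heads and layers into nested events; the sketch only shows the single-stream case.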
Submission Number: 20