The Pensieve Paradigm: Stateful Language Models with Learned Memory Management

ICLR 2026 Conference Submission97 Authors

01 Sept 2025 (modified: 23 Dec 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: LLM, memory management
Abstract: In the world of Harry Potter, when Dumbledore's mind is overburdened, he extracts memories into a Pensieve to be revisited later. In the world of AI, although we possess the Pensieve (mature databases and retrieval systems), our models inexplicably lack the "wand" to operate it. They remain like a Dumbledore without agency, passively accepting a manually engineered context as their entire memory. This work places the wand in the model's hand. We introduce StateLM, a new class of foundation models endowed with an internal reasoning loop for manipulating their own state. We equip the model with a suite of tools, such as dynamic indexing, context pruning, and note-taking, and train it to actively manage this loop. By learning to dynamically construct its own context, the model breaks free from the architectural prison of a fixed context window. The results are striking: our state-management approach decouples performance from context window size, delivering strong, sustained accuracy on extremely long inputs at linear inference cost. We demonstrate this by showing that StateLM reliably retrieves a "needle" from a 1-million-token haystack, a task far beyond the reach of conventional models. On practical document QA tasks from NovelQA and $\infty$Bench, StateLM outperforms strong instruct baselines while using only 1/4 of their active context. An ablation further shows that our curated training pipeline is more effective for learning memory management than agent-like prompting. Together, these results mark a shift from passive predictors to state-aware systems in which reasoning becomes a stateful, manageable process.
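To make the abstract's "internal reasoning loop over an external memory" concrete, below is a minimal sketch of what such a state-management loop could look like. It is an illustration only: the class and method names (StateStore, propose_action, manage_context) and the action set (prune, retrieve, note, answer) are assumptions, not the paper's actual interface or training procedure.

```python
# Hypothetical sketch of a model-driven state-management loop.
# None of these names come from the paper; they illustrate the general idea
# of letting the model edit its own active context via tool calls.

from dataclasses import dataclass, field


@dataclass
class StateStore:
    """External memory the model can write evicted text into and read back from."""
    archive: dict = field(default_factory=dict)   # indexed chunks evicted from context
    notes: list = field(default_factory=list)     # model-written summaries

    def index(self, key: str, text: str) -> None:
        self.archive[key] = text

    def retrieve(self, key: str) -> str:
        return self.archive.get(key, "")

    def take_note(self, note: str) -> None:
        self.notes.append(note)


def manage_context(model, store: StateStore, context: list[str],
                   budget: int, query: str) -> str:
    """Let the model iteratively reshape a bounded active context before answering.

    Assumes `model.propose_action(context, query)` returns one of:
    ("prune", key), ("retrieve", key), ("note", text), or ("answer", text).
    """
    while True:
        action, arg = model.propose_action(context, query)

        if action == "prune":
            # Evict the oldest chunk from the active context into external memory.
            chunk = context.pop(0)
            store.index(arg, chunk)
        elif action == "retrieve":
            # Pull a previously indexed chunk back into the active context.
            context.append(store.retrieve(arg))
        elif action == "note":
            # Persist a compressed summary the model wrote for itself.
            store.take_note(arg)
        elif action == "answer":
            return arg

        # Enforce a fixed chunk budget on the active context.
        while len(context) > budget:
            evicted = context.pop(0)
            store.index(f"auto-{len(store.archive)}", evicted)
```

Under this reading, the linear-cost claim follows from the bounded budget: each step attends over at most a fixed-size active context, so total inference cost grows linearly with input length rather than quadratically with a monolithic window.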
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 97