Keywords: Agent Memory Evolution, Cognition-Inspired, LLM agents
Abstract: Large language model (LLM) agents require long-term user memory for consistent personalization, but limited context windows hinder tracking evolving preferences over long interactions. Existing memory systems mainly rely on static, hand-crafted update rules; although reinforcement learning (RL)-based agents can learn memory updates, sparse outcome rewards provide weak supervision, resulting in unstable long-horizon optimization. Drawing on memory schema theory and the functional division between the prefrontal cortex and the hippocampus, we introduce MemCoE, a cognition-inspired two-stage optimization framework that learns both how memory should be organized and what information to update. In the first stage, we propose Memory Guideline Induction, which optimizes a global guideline via contrastive feedback interpreted as textual gradients; in the second stage, Guideline-Aligned Memory Policy Optimization uses the induced guideline to define structured process rewards and performs multi-turn RL to learn a guideline-following memory evolution policy. We evaluate on three personalization memory benchmarks, covering explicit and implicit preferences as well as different sizes and noise levels, and observe consistent improvements over strong baselines with favorable robustness, transferability, and efficiency. The code is available at https://anonymous.4open.science/r/MemoryEvolve/.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: agent memory, LLM agents
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Approaches to low-resource settings
Languages Studied: English
Submission Number: 3384