Recalling Too Well: Sycophancy and Bias Amplification in Memory-Augmented Models
Keywords: large language models, LLMs, persistent memory systems, memory, memory systems, sycophancy, AI alignment, bias, LLM agents, benchmarking, moral judgment, creative generation, user preferences
Abstract: Persistent memory systems promise to make LLMs more helpful by learning user preferences over time. We show they also make models less correct and less creative, due to systematically biasing outputs through over-alignment to user beliefs. We conduct the first systematic evaluation of sycophancy and bias in memory-augmented agents, testing three state-of-the-art systems (Mem0, MemOS, Zep) across scientific reasoning, moral judgment, and creative generation benchmarks. Memory systems amplify sycophantic behavior across all domains, showing 2-4x higher strict sycophancy rates than chat history baselines in scientific questions and increased user-affirming judgments in moral reasoning tasks. We identify a novel failure mode where memory retrieval causes models to inappropriately anchor creative outputs on irrelevant preferences expressed previously and stored in memory, achieving 87-91% alignment with user preferences compared to 47-55% in chat history baselines. Finally, we benchmark prompt-based mitigation strategies as a potential intervention.
PDF: pdf
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 115
Loading