Mitigating Unintended Memory Use in LLMs via Structured Memory

Hakeem Hannoon; Andrew Zhao; Mihir Narayan; Sharvin Goyal; Ivaxi Sheth

Published: 04 Jun 2026, Last Modified: 04 Jun 2026. Venue: ICML MemFM 2026 Workshop Poster. License: CC BY 4.0

Keywords: personalisation, sycophancy, memory, cross-domain-leakage

Abstract: Conversational language models increasingly rely on persistent user memories for personalization, creating an inference-time surface for unintended recall of stored user information. While unintended memorization is often studied as training-data extraction, deployed memory systems introduce a parallel privacy risk: models may leak sensitive user details across unrelated contexts or defer sycophantically to remembered preferences. We investigate representation-level mitigations that reorganize the same memory set at inference time into fixed-domain partitions, dynamic-domain partitions, or a two-level memory tree, without changing the model or memory content. On PersistBench across seven frontier models, fixed partitioning reduces cross-domain leakage in six of seven models, while dynamic partitioning improves all seven and lowers leakage by $\sim8\%$ on average relative to the flat baseline while preserving desired personalization. These transformations also stack with some prompt-based defenses. Our work positions structured memory as a practical mitigation for unintended memorization in deployed foundation models, complementing prompt defenses.
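As an illustration of what reorganizing the same memory set into fixed-domain partitions could look like at inference time, here is a minimal Python sketch. It is not the authors' implementation: the domain names, the keyword cues, the `assign_domain` heuristic, and the bracketed rendering format are all hypothetical stand-ins, since the abstract does not specify how memories are assigned to domains or how partitions are presented to the model.

```python
from collections import defaultdict

# Hypothetical fixed domains and keyword cues (assumptions for illustration;
# the abstract does not list the domains or the assignment method).
FIXED_DOMAINS = {
    "health": ("allergy", "diagnosed", "medication"),
    "work": ("manager", "deadline", "office"),
    "food": ("vegetarian", "recipe", "restaurant"),
}


def assign_domain(memory: str) -> str:
    """Return the first fixed domain whose cue appears in the memory, else 'other'."""
    text = memory.lower()
    for domain, cues in FIXED_DOMAINS.items():
        if any(cue in text for cue in cues):
            return domain
    return "other"


def partition_memories(memories: list[str]) -> dict[str, list[str]]:
    """Group memories by domain without altering or dropping any memory."""
    partitions: dict[str, list[str]] = defaultdict(list)
    for memory in memories:
        partitions[assign_domain(memory)].append(memory)
    return dict(partitions)


def render_flat(memories: list[str]) -> str:
    """Flat baseline: one undifferentiated list of memories."""
    return "\n".join(f"- {m}" for m in memories)


def render_partitioned(memories: list[str]) -> str:
    """Structured variant: the same memories, grouped under domain headers."""
    parts = partition_memories(memories)
    blocks = []
    for domain in sorted(parts):
        lines = "\n".join(f"- {m}" for m in parts[domain])
        blocks.append(f"[{domain}]\n{lines}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    memories = [
        "User was diagnosed with asthma.",
        "User's manager is named Priya.",
        "User likes hiking.",
    ]
    print(render_partitioned(memories))
```

In this sketch only the presentation changes: both renderings contain exactly the same memory strings, which mirrors the abstract's constraint that neither the model nor the memory content is modified. A dynamic-domain variant would presumably derive the domain set from the memories themselves rather than from a fixed table, and a two-level tree would add a further grouping layer, but those details are not given in the abstract.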

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 56
