CloneMem: Benchmarking Long-Term Memory for AI Clones

Sen Hu; Zhiyu Zhang; YUXIANG WEI; Xueran Han; Zhenheng Tang; Ronghao Chen; Huacan Wang

Published: 03 Mar 2026, Last Modified: 25 Apr 2026. Venue: ICLR 2026 Workshop MemAgents. License: CC BY 4.0

Keywords: agent memory, agent evaluation

TL;DR: We introduce CloneMem, a benchmark for evaluating AI clones' long-term memory using digital traces (diaries, social media) to assess tracking of experiences, emotions, and evolving opinions over time.

Abstract: AI Clones aim to simulate an individual’s thoughts and behaviors to enable long-term, personalized interaction, placing stringent demands on memory systems to model experiences, emotions, and opinions over time. Existing memory benchmarks primarily rely on user–agent conversational histories, which are temporally fragmented and insufficient for capturing continuous life trajectories. We introduce CloneMem, a benchmark for evaluating long-term memory in AI Clone scenarios grounded in non-conversational digital traces, including diaries, social media posts, and emails, spanning one to three years. CloneMem adopts a top-down data construction framework to ensure longitudinal coherence and defines tasks that assess an agent’s ability to track evolving personal states. Experiments show that current memory mechanisms struggle in this setting, highlighting open challenges for life-grounded personalized AI. Code and dataset are available at https://anonymous.4open.science/r/CloneMem-C6E1

Submission Number: 26
