KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions

ACL ARR 2026 January Submission2687 Authors

03 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Large Language Model, Lifelong Benchmark

Abstract: Existing long-horizon memory benchmarks mostly use multi-turn dialogues or synthetic user histories, which makes retrieval performance an imperfect proxy for person understanding. We present KnowMe-Bench, a publicly releasable benchmark built from long-form autobiographical narratives, in which actions, context, and inner thoughts provide dense evidence for inferring stable motivations and decision principles. KnowMe-Bench reconstructs each narrative into a flashback-aware, time-anchored stream and evaluates models with evidence-linked questions spanning factual recall, subjective state attribution, and principle-level reasoning. Across diverse narrative sources, retrieval-augmented systems mainly improve factual accuracy, while errors persist on temporally grounded explanations and higher-level inferences, highlighting the need for memory mechanisms beyond retrieval.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: Information Extraction

Contribution Types: Data resources

Languages Studied: English

Submission Number: 2687
