Keywords: large language models, personalized alignment, next-token prediction, personalized response generation
Abstract: Large language models (LLMs) trained for general \textit{next-token prediction} often fail to generate responses that reflect how specific individuals communicate. Progress on personalized alignment is further limited by the difficulty of collecting real-world personal communication data due to privacy constraints.
We propose \textbf{Your Next Token Prediction (YNTP)}, a task that formulates personalized response generation as token-level prediction conditioned on a user's interaction history. We introduce \textbf{YNTP-100}, a benchmark built from controlled multi-day human--agent conversations with 100 participants, enabling systematic evaluation of user-specific response behavior. We evaluate both prompting-based and parameter-updating alignment methods using metrics of content alignment and stylistic consistency, establishing the first benchmark for this task. The dataset and code are publicly available at: https://github.com/AnonymousHub4Submissions/YNTP100.
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: Language Modeling
Contribution Types: Data resources
Languages Studied: English, Chinese, Japanese
Submission Number: 7540