PrefDisco: Evaluating Proactive Personalization through Interactive Preference Discovery

Published: 06 Oct 2025, Last Modified: 04 Nov 2025
Venue: MTI-LLM @ NeurIPS 2025 Poster
License: CC BY-ND 4.0
Keywords: Personalization, Question Asking, Preference Elicitation
TL;DR: We propose PrefDisco, a benchmark for proactive personalization in which the model must ask about the user's preferences before giving a personalized answer. The benchmark reveals systematic failures in models' personalization abilities.
Abstract: Current language models struggle to discover user preferences through conversation, often producing responses that mismatch individual needs. We introduce PrefDisco, a meta-benchmark framework that transforms existing benchmarks into interactive personalization tasks using psychologically grounded personas with consistent preference patterns. Evaluation of 22 frontier models across ten tasks reveals systematic failures. Counterintuitively, 42.6% of model-task combinations perform worse when attempting personalization than when providing generic responses. We show that models tend not to ask questions even when given the option, although asking questions improves preference alignment. Domain analysis reveals optimization brittleness: mathematical reasoning suffers severe degradation under personalization (3.5% accuracy loss), while social reasoning maintains robustness (3.1% gain). These findings establish interactive preference discovery as a distinct capability requiring dedicated architectural innovations rather than an emergent property of general language understanding.
Submission Number: 173