WikiPersonas: what can we learn from personalized alignment to famous people?

ACL ARR 2025 February Submission2530 Authors

14 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0

Abstract: Preference alignment has become a standard pipeline for fine-tuning models to follow *generic* human preferences. The majority of work seeks to optimize models to produce responses that would be preferable *on average*, simplifying the diverse and often *contradictory* space of human preferences. While research has increasingly focused on personalized alignment, that is, adapting models to individual user preferences, there is a lack of personalized preference datasets that focus on nuanced individual-level preferences. To address this, we introduce WikiPersona: the first fine-grained personalization dataset using well-documented, famous individuals. Our dataset challenges models to align with these personas through an interpretable process: generating verifiable textual descriptions of a persona's background and preferences in addition to alignment. We systematically evaluate different personalization approaches and find that, while few-shot prompting with preferences and fine-tuning fail to simultaneously ensure effectiveness and efficiency, using *inferred personal preferences* as prefixes enables effective personalization, especially on topics where preferences clash, while leading to more equitable generalization across unseen personas.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: personalized preference alignment, personalization, preference alignment

Contribution Types: Approaches low compute settings-efficiency, Data resources, Data analysis

Languages Studied: English

Submission Number: 2530
