Meet Dynamic Individual Preferences: Resolving Conflicting Human Values with Paired Fine-Tuning

ACL ARR 2026 January Submission 8310 Authors

Keywords: Preference alignment, Personalization, Conflicting preferences, Supervised fine-tuning
Abstract: Recent advances in large language models (LLMs) have significantly improved the alignment of models with general human preferences. However, a major challenge remains in adapting LLMs to individual preferences, which are not only diverse but also dynamic. In this paper, we introduce a novel framework, **Preference-Paired Fine-Tuning (PFT)**, designed to align models with contradictory and evolving individual preferences. We present a new dataset, **Value Conflict Dilemma (VCD)**, which comprises scenarios involving conflicting human preferences, enabling evaluation of our approach. Our experiments demonstrate that PFT outperforms single-preference training methods, achieving up to 96.67% accuracy on multiple-choice classification tasks and the highest open-ended generation score of 8.69. PFT also shows significant improvements over DPO, SFT, and other traditional training methods, especially when handling conflicting preferences. Additionally, with limited user history data, models can rapidly infer a user's preference vector, achieving a 44.76% improvement in user-specific preference alignment compared to single-preference models. Code will be released soon.
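
Since the code has not yet been released, the following is a minimal, purely illustrative sketch of the paired-training idea the abstract describes: each training example is a pair of responses reflecting opposing preferences, and the model is conditioned on a preference vector so a single backbone can serve both sides of a value conflict. Every name below (`PreferenceConditionedLM`, `paired_sft_loss`, the toy GRU backbone) is an assumption made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of preference-paired fine-tuning. All interfaces
# here are illustrative assumptions, not the paper's released code.
import torch
import torch.nn as nn

class PreferenceConditionedLM(nn.Module):
    """Toy LM whose token embeddings are shifted by a learned projection
    of a user preference vector (assumed conditioning mechanism)."""
    def __init__(self, vocab_size=100, d_model=64, pref_dim=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pref_proj = nn.Linear(pref_dim, d_model)
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, pref_vec):
        # Broadcast the projected preference vector across all positions.
        h = self.embed(tokens) + self.pref_proj(pref_vec).unsqueeze(1)
        out, _ = self.backbone(h)
        return self.head(out)

def paired_sft_loss(model, prompt, resp_a, resp_b, pref_a, pref_b):
    """Sum of standard next-token losses for the two sides of a
    conflicting-preference pair, each under its own preference vector."""
    ce = nn.CrossEntropyLoss()
    loss = 0.0
    for resp, pref in ((resp_a, pref_a), (resp_b, pref_b)):
        seq = torch.cat([prompt, resp], dim=1)
        logits = model(seq[:, :-1], pref)
        loss = loss + ce(logits.reshape(-1, logits.size(-1)),
                         seq[:, 1:].reshape(-1))
    return loss

# Usage: one dilemma prompt, two opposing target responses, and
# preference vectors at opposite ends of a single value axis.
model = PreferenceConditionedLM()
prompt = torch.randint(0, 100, (1, 12))
resp_a = torch.randint(0, 100, (1, 6))
resp_b = torch.randint(0, 100, (1, 6))
pref_a = torch.ones(1, 8)
pref_b = -pref_a
loss = paired_sft_loss(model, prompt, resp_a, resp_b, pref_a, pref_b)
loss.backward()
```

Under this sketch, the abstract's rapid inference of a preference vector from limited user history would amount to fitting `pref_vec` on a few held-out interactions while the backbone stays frozen; how the paper actually performs this step is not specified here.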
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: safety and alignment, safety and alignment for agents
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 8310