Synthetic Preference Interpolation for Language Model Alignment

ACL ARR 2024 December Submission692 Authors

15 Dec 2024 (modified: 05 Feb 2025) · CC BY 4.0
Abstract: Ensuring alignment with human preferences is a crucial property of large language models (LLMs). Presently, the widely used DPO-based alignment methods achieve good alignment performance by training on preference pairs. However, we point out that the rich information in preference data is not fully exploited by existing pair-wise training methods, which still calls for more effective solutions. To address this limitation and obtain stronger alignment performance, this work introduces a novel approach called Synthetic Preference Interpolation Alignment ($\textbf{SPIA}$), which generates interpolated synthetic preference data representing intermediate quality between the chosen and rejected samples. We conduct extensive experiments and evaluations on both annotated preference data and self-play preference data, demonstrating that our method achieves strong alignment performance.
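
The abstract does not specify how the interpolation is performed, so the sketch below is illustrative only and is not the authors' method. It assumes embedding-space linear interpolation and shows how one annotated pair could be expanded into several DPO-style training pairs via intermediate-quality samples; the names `interpolate_preference`, `build_spia_pairs`, and `lam` are hypothetical.

```python
import torch

def interpolate_preference(chosen_emb: torch.Tensor,
                           rejected_emb: torch.Tensor,
                           lam: float) -> torch.Tensor:
    # Hypothetical choice: linear interpolation in a shared embedding
    # space, producing a synthetic sample of intermediate quality.
    return lam * chosen_emb + (1.0 - lam) * rejected_emb

def build_spia_pairs(chosen_emb: torch.Tensor,
                     rejected_emb: torch.Tensor,
                     lambdas=(0.25, 0.5, 0.75)):
    # By construction, each interpolated sample ranks above `rejected`
    # and below `chosen`, so a single annotated pair yields several
    # additional preference pairs for pair-wise (DPO-style) training.
    pairs = [(chosen_emb, rejected_emb)]           # original pair
    for lam in lambdas:
        mid = interpolate_preference(chosen_emb, rejected_emb, lam)
        pairs.append((chosen_emb, mid))            # chosen > interpolated
        pairs.append((mid, rejected_emb))          # interpolated > rejected
    return pairs

# Toy usage: one annotated pair becomes seven training pairs.
chosen, rejected = torch.randn(8), torch.randn(8)
print(len(build_spia_pairs(chosen, rejected)))     # 7
```
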
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: fine-tuning, data augmentation, text-to-text generation, NLP datasets
Contribution Types: Data resources, Data analysis, Theory
Languages Studied: English
Submission Number: 692