Preference-Strength-Aware Self-Improving Alignment with Generative Preference Models

Yuanzhao Zhai, Zhuo Zhang, Cheng Yang, Kele Xu, Yue Yu, Wei Li, Hui Wang, Zenglin Xu, Dawei Feng, Bo Ding, Huaimin Wang

Published: 13 Jul 2025, Last Modified: 23 Jan 2026, Crossref, CC BY-SA 4.0