Robust Preference Optimization through Reward Model Distillation
Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Peter Shaw, Jonathan Berant
Published: 01 Jan 2025, Last Modified: 23 Jun 2025
Trans. Mach. Learn. Res. 2025
CC BY-SA 4.0