Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao, Wenhao Zhan, Jonathan Daniel Chang, Gokul Swamy, Kianté Brantley, Jason D. Lee, Wen Sun
Published: 01 Jan 2025, Last Modified: 14 May 2025
ICLR 2025
CC BY-SA 4.0