ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
Published: 01 Jan 2024, Last Modified: 19 Jan 2025
ICML 2024
CC BY-SA 4.0