Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment
Chenliang Li, Siliang Zeng, Zeyi Liao, Jiaxiang Li, Dongyeop Kang, Alfredo García, Mingyi Hong
Published: 01 Jan 2025, Last Modified: 15 Oct 2025
ICLR 2025
CC BY-SA 4.0