Learning From Diverse Experts: Behavior Alignment Through Multi-Objective Inverse Reinforcement Learning

Published: 06 Mar 2025, Last Modified: 05 May 2025
Venue: ICLR 2025 Bi-Align Workshop Poster
License: CC BY 4.0
Keywords: Diverse preference alignment, Specifying human objectives, Reward hacking and modeling, Annotation of human values
TL;DR: Our work presents a framework based on multi-objective IRL that learns from expert demonstrations with little, and possibly noisy, knowledge of the experts' preferences, while transferring efficiently to unseen preferences.
Abstract: Imitation learning (IL) from demonstrations serves as a data-efficient and practical framework for achieving human-level performance and behavior alignment with human experts in sequential decision making. However, existing IL approaches mostly presume that expert demonstrations are homogeneous and largely ignore the practical issue of multiple performance criteria and the resulting diverse preferences of the experts. To tackle this, we propose to learn simultaneously from multiple experts with different preferences through the lens of multi-objective inverse reinforcement learning (MOIRL). Specifically, MOIRL achieves unified learning from diverse experts by inferring the vector-valued reward function of each expert and reconciling these via reward consensus. Built on this, we propose Multi-Objective Inverse Soft-Q Learning (MOIQ), which penalizes differences among the inferred rewards to encourage reward consensus. This consensus among demonstrators is what allows the approach to transfer to unseen preferences. To further annotate the unknown preferences of the demonstrations, we introduce a posterior network that predicts the preferences of given trajectories. Extensive experiments demonstrate that MOIQ is competitive in challenging scenarios with few and noisy annotations, outperforms strong benchmark methods, and approaches expert-level performance in the fully annotated regime.
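To make the two central ideas of the abstract concrete, the sketch below illustrates (i) a vector-valued reward with a consensus penalty that discourages disagreement across per-expert reward estimates, and (ii) a posterior network that maps a trajectory to a preference vector on the simplex. This is a minimal illustration assuming simple MLP architectures and a mean-squared consensus penalty; all module names, dimensions, and the specific penalty form are our assumptions, not the authors' implementation.

```python
# Hypothetical sketch of MOIRL-style components described in the abstract.
# Shapes, architectures, and the consensus penalty form are illustrative
# assumptions only.
import torch
import torch.nn as nn


class VectorReward(nn.Module):
    """Maps a (state, action) pair to a k-dimensional reward vector."""

    def __init__(self, obs_dim: int, act_dim: int, n_objectives: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
            nn.Linear(64, n_objectives),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))


class PreferencePosterior(nn.Module):
    """Predicts a preference vector (on the simplex) from a flattened trajectory."""

    def __init__(self, traj_dim: int, n_objectives: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(traj_dim, 64), nn.ReLU(),
            nn.Linear(64, n_objectives),
        )

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # Softmax keeps predicted preference weights nonnegative, summing to 1.
        return torch.softmax(self.net(traj), dim=-1)


def consensus_penalty(per_expert_rewards: torch.Tensor) -> torch.Tensor:
    """Penalize disagreement between each expert's reward estimate and the
    mean estimate, pushing the experts toward a shared vector-valued reward.

    per_expert_rewards: tensor of shape (n_experts, batch, n_objectives).
    """
    mean_reward = per_expert_rewards.mean(dim=0, keepdim=True)
    return ((per_expert_rewards - mean_reward) ** 2).mean()


# Usage: scalarize the shared vector reward with an expert's preference
# weights (here hand-set; in MOIQ they could come from annotations or the
# posterior network above).
obs_dim, act_dim, k = 8, 2, 3
reward_net = VectorReward(obs_dim, act_dim, k)
obs, act = torch.randn(32, obs_dim), torch.randn(32, act_dim)
w = torch.tensor([0.5, 0.3, 0.2])        # preference weights on the simplex
scalar_reward = reward_net(obs, act) @ w  # w^T r(s, a), shape (32,)
```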
Submission Type: Long Paper (9 Pages)
Archival Option: This is a non-archival submission
Presentation Venue Preference: ICLR 2025
Submission Number: 22