Learning From Multi-Expert Demonstrations: A Multi-Objective Inverse Reinforcement Learning Approach

22 Sept 2023 (modified: 11 Feb 2024) | Submitted to ICLR 2024
Primary Area: reinforcement learning
Keywords: Inverse reinforcement learning, multiple experts, multi-objective learning
Abstract: Imitation learning (IL) from a single expert's demonstrations has reached expert-level performance in many MuJoCo environments. However, real-world settings often involve demonstrations from multiple experts, whose policies differ because the demonstrators have varying preferences. We propose a multi-objective inverse reinforcement learning (MOIRL) approach that utilizes demonstrations from multiple experts. Because it assumes a common reward among demonstrators, the approach transfers to different preferences. We conduct experiments in the discrete Deep Sea Treasure (DST) environment and obtain promising preliminary results. In both continuous DST and MuJoCo environments, we demonstrate that this approach is competitive across various preferences while, unlike IRL algorithms, using only a single model within the soft actor-critic (SAC) framework rather than $n$ models, one for each distinct preference.
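The idea of a common reward combined with demonstrator-specific preferences can be illustrated with a small sketch. This is not the method of the submission: it assumes, purely for illustration, that the shared reward is a vector over objectives and that each expert combines the objectives by a linear weighting. The function names, the two outcomes, and all numbers are hypothetical.

```python
# Illustrative sketch only. Linear scalarization is an ASSUMPTION here;
# the abstract does not state how preferences combine the objectives.

def scalarize(reward_vector, preference):
    """Return the preference-weighted sum of a multi-objective reward."""
    if len(reward_vector) != len(preference):
        raise ValueError("reward and preference must have the same length")
    return sum(w * r for w, r in zip(preference, reward_vector))

# Made-up, DST-like outcomes shared by all experts:
# (treasure value, time penalty) for two candidate trajectories.
OUTCOMES = {
    "near_treasure": (1.0, -1.0),
    "far_treasure": (10.0, -8.0),
}

def best_outcome(preference):
    """Pick the outcome with the highest scalarized reward for one preference."""
    return max(OUTCOMES, key=lambda name: scalarize(OUTCOMES[name], preference))

hurried = (0.2, 0.8)  # hypothetical expert who weights time heavily
patient = (0.8, 0.2)  # hypothetical expert who weights treasure heavily

print(best_outcome(hurried), best_outcome(patient))
```

Under these assumed numbers the two experts share one reward vector per outcome yet prefer different trajectories, which is the kind of diversity among demonstrators that the abstract attributes to varying preferences.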
Supplementary Material: zip
Submission Number: 5855