Learning then Leveraging Structures Help with Complex, Compositional, Causal Sequential Tasks in Inverse Reinforcement Learning

20 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Reinforcement learning, inverse reinforcement learning, belief propagation
TL;DR: We propose a novel IRL method that learns approximate causal motifs as an FSA, then uses this motif to solve the IRL problem.
Abstract: The field of Inverse Reinforcement Learning (IRL) has seen substantial advances in recent years, with newer approaches yielding important applications in areas such as robotics, cognition, and healthcare. This paper highlights the limitations of foundational IRL methods when learning an agent’s reward function from *expert trajectories that have underlying causal reward structure*. We posit that imbuing IRL models with causal structural motifs, which capture the underlying relationships between actions and outcomes (i.e., the reward logic), can enable and enhance their performance. Based on this hypothesis, we propose SMIRL, an IRL approach that first learns the task’s structure as a finite-state automaton (FSA) and then leverages this structural motif to solve the IRL problem. We demonstrate SMIRL’s capabilities in both discrete (grid world) and high-dimensional continuous-domain environments across four logic-based tasks. SMIRL proves adept at learning tasks characterized by causal reward functions, a known limitation of foundational IRL approaches. Our model also outperforms the baselines in sample efficiency on these tasks. We further show promising test results in a modified continuous domain on tasks with compositional reward functions.
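To make the abstract's central idea concrete, here is a minimal sketch (not the authors' code) of how a causal reward structure can be encoded as a finite-state automaton: reward is emitted only when subgoals are completed in the correct causal order. The state names, events, and reward scheme below are illustrative assumptions, not details from the paper.

```python
class RewardFSA:
    """Illustrative FSA encoding a sequential "get key, then open door" reward logic."""

    def __init__(self):
        # transitions[state][event] -> next state; unknown events leave the state unchanged
        self.transitions = {
            "start":   {"got_key": "has_key"},
            "has_key": {"opened_door": "done"},
        }
        self.accepting = {"done"}
        self.state = "start"

    def step(self, event):
        """Advance on an observed event; reward 1.0 only upon reaching an accepting state."""
        self.state = self.transitions.get(self.state, {}).get(event, self.state)
        return 1.0 if self.state in self.accepting else 0.0


fsa = RewardFSA()
rewards = [fsa.step(e) for e in ["opened_door", "got_key", "opened_door"]]
# Opening the door before picking up the key yields no reward;
# only the causally ordered sequence reaches the accepting state.
```

Under this kind of structure, a trajectory's reward depends on the history of events rather than the current observation alone, which is precisely what makes it hard for foundational IRL methods that assume a Markovian reward over states.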
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2714