Robust Inverse Reinforcement Learning Through Bayesian Theory of Mind

Published: 20 Jun 2023, Last Modified: 05 Jul 2023, ToM 2023
Keywords: Inverse Reinforcement Learning, Bayesian Theory of Mind, Robust Control
Abstract: We consider the Bayesian theory of mind (BTOM) framework for learning from demonstrations via inverse reinforcement learning (IRL). The BTOM model consists of a joint representation of the agent's reward function and the agent's internal subjective model of the environment dynamics, which may be inaccurate. In this paper, we make use of a class of prior distributions that parametrize how accurate the agent's model of the environment is, and develop efficient algorithms to estimate the agent's reward and subjective dynamics in high-dimensional settings. The BTOM framework departs from existing offline model-based IRL approaches by performing simultaneous estimation of the reward and the dynamics. Our analysis reveals a novel insight: the estimated policy exhibits robust performance when the (expert) agent is believed (a priori) to have a highly accurate model of the environment. We verify this observation in MuJoCo environments and show that our algorithms outperform state-of-the-art offline IRL algorithms.
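As a rough illustration of what "a prior that parametrizes how accurate the agent's model is" can look like, the sketch below works in a tabular MDP and alternates a Dirichlet MAP-style estimate of the subjective dynamics (concentration `kappa` around a nominal model `P0`; large `kappa` encodes a prior belief that the expert's model is highly accurate) with a maximum-entropy IRL gradient step on the reward. This is a hypothetical simplification written for this page, not the paper's actual algorithm or code; all names (`btom_irl_sketch`, `kappa`, `P0`, etc.) are invented, and the true BTOM estimation couples the dynamics to the policy likelihood rather than to raw transition counts.

```python
import numpy as np
from scipy.special import logsumexp

def soft_value_iteration(P, r, gamma=0.95, n_iter=200):
    """Soft (max-ent) Bellman backups; returns a soft-optimal policy pi[s, a].

    P: (A, S, S) transition tensor, r: (S,) state reward.
    """
    S = r.shape[0]
    V = np.zeros(S)
    for _ in range(n_iter):
        Q = r[:, None] + gamma * np.einsum('ast,t->sa', P, V)
        V = logsumexp(Q, axis=1)
    return np.exp(Q - V[:, None])

def state_visitation(P, pi, rho0, T=50):
    """Average state-visitation distribution over T steps under pi and P."""
    rho, mu = rho0.copy(), rho0.copy()
    for _ in range(T - 1):
        rho = np.einsum('s,sa,ast->t', rho, pi, P)
        mu += rho
    return mu / T

def btom_irl_sketch(demos, P0, S, A, kappa=100.0, lr=0.05, outer=30):
    """Hypothetical alternating estimate of (reward, subjective dynamics).

    demos: list of trajectories, each a list of (s, a, s_next) tuples.
    P0: (A, S, S) nominal dynamics; kappa: Dirichlet concentration around P0.
    """
    counts = np.zeros((A, S, S))
    mu_demo = np.zeros(S)
    rho0 = np.zeros(S)
    for traj in demos:
        rho0[traj[0][0]] += 1.0
        for (s, a, s_next) in traj:
            counts[a, s, s_next] += 1.0
            mu_demo[s] += 1.0
    rho0 /= rho0.sum()
    mu_demo /= mu_demo.sum()

    r = np.zeros(S)
    for _ in range(outer):
        # Posterior-mean-style blend of demo transition counts with the
        # Dirichlet prior centered on the nominal model P0.
        P_hat = counts + kappa * P0
        P_hat /= P_hat.sum(axis=2, keepdims=True)
        # Max-ent IRL gradient step on the reward under the current P_hat:
        # push up states the expert visits, push down states the policy visits.
        pi = soft_value_iteration(P_hat, r)
        mu_pi = state_visitation(P_hat, pi, rho0)
        r += lr * (mu_demo - mu_pi)
    return r, P_hat
```

In this toy setting, sending `kappa` to infinity pins the subjective dynamics to the nominal model, while `kappa = 0` trusts the demonstration counts alone; the abstract's robustness observation concerns the regime where the prior asserts a highly accurate expert model.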
Supplementary Material: pdf
Submission Number: 24