Inverse Reinforcement Learning of Interactive Scenarios

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Inverse Reinforcement Learning
Abstract: This paper studies the problem in which a learner aims to learn, from interactions with an expert, both the expert's reward function and a policy for interacting with that expert. We formulate the problem as a stochastic bi-level optimization problem in which the lower level learns a reward function that explains the expert's behavior and the upper level learns a policy for interacting with the expert. We develop a double-loop algorithm, General Scenario Interactive Inverse Reinforcement Learning (GSIIRL), which solves the lower-level optimization problem in the inner loop and the upper-level optimization problem in the outer loop. We formally guarantee that GSIIRL converges at a rate of $\mathcal{O}(\frac{1}{\sqrt{K}})$ and empirically validate the algorithm through simulations.
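The double-loop structure the abstract describes (an inner loop that fits the reward to the expert's behavior, an outer loop that updates the interaction policy) can be sketched on a toy problem. The quadratic objectives, step sizes, and iteration counts below are illustrative assumptions for exposition, not the paper's actual GSIIRL updates.

```python
# Toy sketch of a double-loop bi-level scheme (illustrative, not GSIIRL itself).
# Lower level: fit a scalar reward parameter theta given the current policy phi.
# Upper level: update the policy parameter phi given the fitted theta.

def lower_loss_grad(theta, phi):
    # Gradient of the toy lower-level loss 0.5 * (theta - phi)^2,
    # minimized at theta = phi (reward "explains" behavior under phi).
    return theta - phi

def upper_loss_grad(phi, theta):
    # Gradient of the toy upper-level loss
    # 0.5 * (phi - 1.0)^2 + 0.5 * (phi - theta)^2.
    return (phi - 1.0) + (phi - theta)

def double_loop(K=200, inner_steps=50, lr_in=0.1, lr_out=0.05):
    theta, phi = 0.0, 0.0
    for _ in range(K):                 # outer loop: upper-level policy step
        for _ in range(inner_steps):   # inner loop: lower-level reward fit
            theta -= lr_in * lower_loss_grad(theta, phi)
        phi -= lr_out * upper_loss_grad(phi, theta)
    return theta, phi

theta, phi = double_loop()
```

With these toy objectives, the inner loop drives theta toward phi, and the outer loop then drives phi (and hence theta) toward 1.0, illustrating how the two levels are coupled.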
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 15664