Keywords: Inverse Reinforcement Learning, Neuroscience, Reinforcement Learning, Robotics
TL;DR: We develop a distributional offline IRL framework that infers reward distributions and risk-sensitive policies via stochastic dominance and distortion risk measures, enabling state-of-the-art performance on synthetic, neural, and MuJoCo benchmarks.
Abstract: We propose a distributional framework for offline Inverse Reinforcement Learning (IRL) that jointly models uncertainty over reward functions and full distributions of returns. Unlike conventional IRL approaches, which recover a deterministic reward estimate or match only expected returns, our method captures richer structure in expert behavior: it learns a reward distribution by minimizing violations of first-order stochastic dominance (FSD) and integrates distortion risk measures (DRMs) into policy learning, thereby recovering both reward distributions and distribution-aware policies. This formulation is well-suited to behavior analysis and risk-aware imitation learning. Empirical results on synthetic benchmarks, real-world neurobehavioral data, and MuJoCo control tasks demonstrate that our method recovers expressive reward representations and achieves state-of-the-art imitation performance.
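To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of (i) an FSD-violation penalty between two empirical return distributions and (ii) a distortion risk measure evaluated on sampled returns. This is not the paper's implementation; the function names, the grid-based CDF comparison, and the choice of a CVaR-style distortion are illustrative assumptions.

```python
import numpy as np

def fsd_violation(expert_returns, policy_returns, n_grid=200):
    """Penalty for violations of first-order stochastic dominance (FSD).

    The policy's return distribution FSD-dominates the expert's when its CDF
    lies at or below the expert's CDF everywhere; this penalty integrates the
    positive part of any crossing over a common grid (illustrative estimator).
    """
    lo = min(expert_returns.min(), policy_returns.min())
    hi = max(expert_returns.max(), policy_returns.max())
    grid = np.linspace(lo, hi, n_grid)
    # Empirical CDFs evaluated on the common grid.
    cdf_policy = np.searchsorted(np.sort(policy_returns), grid, side="right") / len(policy_returns)
    cdf_expert = np.searchsorted(np.sort(expert_returns), grid, side="right") / len(expert_returns)
    # Integrate the positive part of (F_policy - F_expert).
    return np.trapz(np.maximum(cdf_policy - cdf_expert, 0.0), grid)

def distortion_risk(returns, distortion):
    """Distortion risk measure: weight the empirical quantile function by the
    increments of a distortion function g on [0, 1]."""
    sorted_returns = np.sort(returns)
    n = len(sorted_returns)
    u = np.arange(n + 1) / n
    weights = distortion(u[1:]) - distortion(u[:-1])  # mass assigned to each order statistic
    return np.dot(weights, sorted_returns)

# Example distortion: CVaR at level alpha=0.1, g(u) = min(u / alpha, 1),
# which averages the worst 10% of returns (a risk-averse evaluation).
cvar_01 = lambda u: np.minimum(u / 0.1, 1.0)

rng = np.random.default_rng(0)
expert = rng.normal(1.0, 0.5, size=1000)   # hypothetical expert returns
policy = rng.normal(0.8, 0.7, size=1000)   # hypothetical policy returns
print("FSD violation:", fsd_violation(expert, policy))
print("CVaR(0.1) of policy returns:", distortion_risk(policy, cvar_01))
```

In an IRL training loop, a penalty like `fsd_violation` could serve as one term of the objective and `distortion_risk` as the risk-sensitive policy evaluation criterion; how the paper combines them is described in the full text, not here.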
Primary Area: reinforcement learning
Submission Number: 13557