Model Predictive Adversarial Imitation Learning for Planning from Observation

Published: 16 Sept 2025 · Last Modified: 16 Sept 2025 · CoRL 2025 Poster · CC BY 4.0
Keywords: Model Predictive Control, Imitation Learning, Reinforcement Learning
TL;DR: We introduce planning-based Adversarial Imitation Learning for interpretable, steerable, robust, and sample-efficient learning from observation.
Abstract: Humans can often perform a new task after a few demonstrations by inferring the underlying intent. For robots, recovering the demonstrator's intent through a learned reward function can enable more efficient, interpretable, and robust imitation through planning. A common planning-from-demonstration paradigm first learns a reward via Inverse Reinforcement Learning (IRL) and then deploys it via Model Predictive Control (MPC). In this work, we unify these two procedures by introducing planning-based Adversarial Imitation Learning (AIL), which simultaneously learns a reward and improves a planning-based agent through experience, using observation-only demonstrations. We study the advantages of planning-based AIL in generalization, interpretability, robustness, and sample efficiency through experiments on simulated control tasks and real-world navigation in few- and single-demonstration settings.
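The abstract describes alternating two steps: training a discriminator on observations alone to serve as a reward, and running an MPC planner against that reward to generate new agent experience. The sketch below illustrates one way such a loop could look, under strong simplifying assumptions: the toy 2-D point-mass dynamics, the random-shooting planner, the network sizes, and all names (`step`, `mpc_plan`, `reward`) are hypothetical choices for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy known dynamics (assumed): a 2-D point mass the planner rolls out internally.
def step(obs, act):
    return obs + 0.1 * act.clamp(-1.0, 1.0)

# Discriminator over observations only: no expert actions are ever needed.
disc = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)

def reward(obs):
    # AIL-style reward: high where the discriminator thinks obs looks expert-like.
    return nn.functional.logsigmoid(disc(obs)).squeeze(-1)

# Random-shooting MPC: sample action sequences, score each by the learned reward.
def mpc_plan(obs, horizon=10, n_samples=256):
    acts = torch.randn(n_samples, horizon, 2)
    o = obs.expand(n_samples, 2)
    ret = torch.zeros(n_samples)
    for t in range(horizon):
        o = step(o, acts[:, t])
        ret += reward(o)
    return acts[ret.argmax(), 0]  # receding horizon: execute only the first action

# Observation-only demonstrations: states along the diagonal toward the goal (1, 1).
line = torch.linspace(0.0, 1.0, 50)
demo_obs = torch.stack([line, line], dim=1)

for it in range(200):
    # 1) Collect agent observations by planning against the current reward.
    obs, agent_obs = torch.zeros(2), []
    for _ in range(25):
        with torch.no_grad():
            act = mpc_plan(obs)
        obs = step(obs, act)
        agent_obs.append(obs)
    agent_obs = torch.stack(agent_obs)

    # 2) Adversarial update: demonstrations labeled 1, agent rollouts labeled 0.
    logits = disc(torch.cat([demo_obs, agent_obs]))
    labels = torch.cat([torch.ones(len(demo_obs), 1), torch.zeros(len(agent_obs), 1)])
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the planner always optimizes the current discriminator-based reward, improving the discriminator immediately changes the agent's behavior, which is the unification of IRL and MPC that the abstract highlights.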
Submission Number: 3