Keywords: Imitation Learning, Reinforcement Learning, Model Predictive Control
TL;DR: Towards real-world IRL for robot learning, we propose planning-based Adversarial Imitation Learning, which simultaneously learns a reward and improves a planning-based agent through interaction and observation-only demonstrations.
Abstract: Humans can often perform a new task after observing a few demonstrations by inferring the underlying intent. For robots, recovering the intent of the demonstrator through a learned reward function can enable more efficient, interpretable, and robust imitation through planning. A common paradigm for learning to plan from demonstrations involves first solving for a reward via Inverse Reinforcement Learning (IRL) and then deploying it via Model Predictive Control (MPC). In this work, we unify these two procedures by introducing planning-based Adversarial Imitation Learning, which simultaneously learns a reward and improves a planning-based agent through experience while using observation-only demonstrations. We study the advantages of planning-based AIL in generalization, interpretability, robustness, and sample efficiency through experiments in simulated control tasks and real-world navigation from a few or even a single observation-only demonstration.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 2874