Model-Based Adversarial Imitation Learning As Online Fine-Tuning

Rafael Rafailov; Victor Kolev; Kyle Beltran Hatch; John D Martin; Mariano Phielipp; Jiajun Wu; Chelsea Finn

Published: 03 Mar 2023, Last Modified: 12 Apr 2023, RRL 2023 Poster, Readers: Everyone

Keywords: imitation learning, model-based reinforcement learning, offline reinforcement learning, fine-tuning

TL;DR: We argue that offline model-based RL algorithms are better suited to the adversarial imitation learning setting than online algorithms.

Abstract: In many real-world applications of sequential decision-making, such as robotics or autonomous driving, expert-level data is available (or easily obtainable) through methods such as tele-operation. However, directly learning to copy these expert behaviours can result in poor performance due to distribution shift at deployment time. Adversarial imitation learning algorithms alleviate this issue by learning to match the expert state-action distribution through additional environment interactions. Such methods are built around standard reinforcement learning algorithms, using either model-based or model-free approaches. In this work, we focus on the model-based approach and argue that algorithms developed for online RL are sub-optimal for the distribution matching problem. We theoretically justify the use of conservative algorithms developed for the offline learning paradigm in online adversarial imitation learning, and empirically demonstrate improved performance and safety on a complex long-range robot manipulation task, directly from images.

Track: Opinion Paper

Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
