Improving Generative Adversarial Imitation Learning with Non-expert Demonstrations

Voot Tangkaratt; Masashi Sugiyama

27 Sept 2018 (modified: 05 May 2023), ICLR 2019 Conference Blind Submission, Readers: Everyone

Abstract: Imitation learning aims to learn an optimal policy from expert demonstrations, and its recent combination with deep learning has shown impressive performance. However, collecting the large number of expert demonstrations that deep learning requires is time-consuming and demands considerable expert effort. In this paper, we propose a method to improve generative adversarial imitation learning (GAIL) by using additional information from non-expert demonstrations, which are easier to obtain. The key idea of our method is to learn discriminator functions by multiclass classification, in which non-expert demonstrations are regarded as being drawn from an extra class. Experiments on continuous control tasks demonstrate that our method learns better policies than the GAIL baseline when the number of expert demonstrations is small.

Keywords: Imitation learning, Generative adversarial imitation learning

TL;DR: We improve GAIL by learning discriminators using multiclass classification, with non-expert demonstrations regarded as an extra class.
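
Illustration: the multiclass idea in the abstract can be sketched as a three-way classification loss over expert, learner, and non-expert samples. This is a minimal sketch under stated assumptions, not the authors' implementation: the class ordering, the function names, and the use of a plain softmax cross-entropy are all hypothetical choices, and the abstract does not say how the discriminator outputs are turned into a reward for the policy, so that step is omitted.

```python
import numpy as np

# Hypothetical class indices; the listing above does not specify an ordering.
EXPERT, AGENT, NON_EXPERT = 0, 1, 2


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def multiclass_discriminator_loss(logits_expert, logits_agent, logits_non_expert):
    """Mean cross-entropy of a 3-class discriminator.

    Each argument is an array of shape (batch, 3) holding the discriminator's
    logits for state-action pairs drawn from one source. Non-expert
    demonstrations get their own label instead of being merged with the
    expert or learner samples (assumed form of the loss).
    """
    logits = np.concatenate([logits_expert, logits_agent, logits_non_expert], axis=0)
    labels = np.concatenate([
        np.full(len(logits_expert), EXPERT),
        np.full(len(logits_agent), AGENT),
        np.full(len(logits_non_expert), NON_EXPERT),
    ])
    log_probs = log_softmax(logits)
    # Pick the log-probability assigned to the true source of each sample.
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With this formulation, an uninformative discriminator (all logits equal) has a loss of log 3, and the loss approaches zero as the three sources become separable. Standard GAIL corresponds to the two-class case with only expert and learner samples.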
