- Abstract: A noisy and diverse demonstration set may hinder the performances of an agent aiming to acquire certain skills via imitation learning. However, state-of-the-art imitation learning algorithms often assume the optimality of the given demonstration set. In this paper, we address such optimal assumption by learning only from the most suitable demonstrations in a given set. Suitability of a demonstration is estimated by whether imitating it produce desirable outcomes for achieving the goals of the tasks. For more efficient demonstration suitability assessments, the learning agent should be capable of imitating a demonstration as quick as possible, which shares similar spirit with fast adaptation in the meta-learning regime. Our framework, thus built on top of Model-Agnostic Meta-Learning, evaluates how desirable the imitated outcomes are, after adaptation to each demonstration in the set. The resulting assessments hence enable us to select suitable demonstration subsets for acquiring better imitated skills. The videos related to our experiments are available at: https://sites.google.com/view/deepdj
- Keywords: Imitation Learning, Noisy Demonstration Set, Meta-Learning
- TL;DR: We propose a framework to learn a good policy through imitation learning from a noisy demonstration set via meta-training a demonstration suitability assessor.