Detecting Incorrect Visual Demonstrations for Improved Policy Learning

16 Jun 2022, 10:45 (modified: 16 Nov 2022, 00:52), CoRL 2022 Poster, Readers: Everyone
Student First Author: yes
Keywords: Imitation Learning, Visual Demonstrations, Incorrect Demonstrations
TL;DR: A framework for detecting incorrect visual demonstrations to improve imitation learning
Abstract: Learning tasks only from raw video demonstrations is the current state of the art in robotics visual imitation learning research. The implicit assumption here is that all video demonstrations show an optimal or sub-optimal way of performing the task. What if that is not true? What if one or more videos show a wrong way of executing the task? A task policy learned from such incorrect demonstrations can be unsafe for robots and humans. It is therefore important to analyze the video demonstrations for correctness before handing them over to the policy learning algorithm. This is a challenging task, especially due to the very large state space. This paper proposes a framework to autonomously detect incorrect video demonstrations of sequential tasks consisting of several sub-tasks. We analyze the demonstration pool to identify videos in which the task features follow a ‘disruptive’ sequence. We use entropy to measure this disruption and, by solving a min-max problem, assign low weights to incorrect videos. We evaluated the framework on two real-world video datasets: our custom-designed Tea-Making dataset with a YuMi robot and the publicly available 50-Salads dataset. Experimental results show that the proposed framework detects incorrect video demonstrations effectively, even when they make up 40% of the demonstration set. We also show that various state-of-the-art imitation learning algorithms learn a better policy when incorrect demonstrations are discarded from the training pool.
Supplementary Material: zip