Eliciting Compatible Demonstrations for Multi-Human Imitation Learning

Kanishk Gandhi; Siddharth Karamcheti; Madeline Liao; Dorsa Sadigh

Published: 10 Sept 2022, Last Modified: 27 Apr 2025, CoRL 2022 Poster, Readers: Everyone

Keywords: Interactive Imitation Learning, Active Demonstration Elicitation, Human Robot Interaction

TL;DR: We introduce an approach for measuring the compatibility between a base policy and a given user demonstration. We use this compatibility measure to actively elicit demonstrations from multiple humans to improve performance on manipulation tasks.

Abstract: Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation. While the ideal dataset for imitation learning is homogeneous and low-variance (reflecting a single, optimal method for performing a task), natural human behavior has a great deal of heterogeneity, with several optimal ways to demonstrate a task. This multimodality is inconsequential to human users, with task variations manifesting as subconscious choices; for example, reaching down, then across to grasp an object, versus reaching across, then down. Yet, this mismatch presents a problem for interactive imitation learning, where sequences of users improve on a policy by iteratively collecting new, possibly conflicting demonstrations. To combat this problem of demonstrator incompatibility, this work designs an approach for 1) measuring the compatibility of a new demonstration given a base policy, and 2) actively eliciting more compatible demonstrations from new users. Across two simulation tasks requiring long-horizon, dexterous manipulation and a real-world "food plating" task with a Franka Emika Panda arm, we show that we can both identify incompatible demonstrations via post-hoc filtering, and apply our compatibility measure to actively elicit compatible demonstrations from new users, leading to improved task success rates across simulated and real environments.

Student First Author: yes

Supplementary Material: zip

Website: https://sites.google.com/view/eliciting-demos-corl22/home

Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/eliciting-compatible-demonstrations-for-multi/code)
