Keywords: robot manipulation, imitation learning, human-in-the-loop learning
TL;DR: A system that improves real-world contact-rich robot manipulation policies with human corrections
Abstract: We address key challenges in Dataset Aggregation (DAgger) for real-world
contact-rich manipulation: how to collect informative human correction data and how to
effectively update policies with this new data. We introduce Compliant Residual
DAgger (CR-DAgger), which contains two novel components: 1) a Compliant
Intervention Interface that leverages compliance control, allowing humans to
provide gentle, accurate delta action corrections without interrupting the ongoing
robot policy execution; and 2) a Compliant Residual Policy formulation that learns
from human corrections while incorporating force feedback and force control.
Our system significantly enhances performance on precise contact-rich
manipulation tasks using minimal correction data, improving base policy success rates
by over 60% on two challenging tasks (book flipping and belt assembly) while
outperforming both retraining-from-scratch and finetuning approaches. Through
extensive real-world experiments, we provide practical guidance for implementing
effective DAgger in real-world robot learning tasks.
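
To illustrate the idea of delta action corrections and a residual on top of a base policy, here is a minimal sketch. It is not the authors' implementation: the function names `correction_label` and `compose_action`, the three-element action vectors, and the purely additive composition are all assumptions for illustration, and the force feedback and force control described in the abstract are omitted.

```python
from typing import List, Sequence


def correction_label(base_action: Sequence[float],
                     executed_action: Sequence[float]) -> List[float]:
    """Hypothetical helper: the delta a human applied on top of the base command.

    Assumes the correction is recorded as the difference between the action
    actually executed under human guidance and the base policy's command.
    """
    return [e - b for b, e in zip(base_action, executed_action)]


def compose_action(base_action: Sequence[float],
                   residual_delta: Sequence[float]) -> List[float]:
    """Hypothetical helper: add a residual delta to the base policy's action."""
    return [b + d for b, d in zip(base_action, residual_delta)]


# Illustrative numbers only (not data from the paper).
base = [0.25, 0.5, 0.125]      # action commanded by the base policy
executed = [0.25, 0.75, 0.0]   # action executed while the human corrected

delta = correction_label(base, executed)   # training target for a residual
corrected = compose_action(base, delta)    # base action plus residual
```

In this sketch the base policy is left untouched and only the delta is learned, which is the general appeal of a residual formulation when correction data is scarce.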
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 10973