Keywords: Imitation Learning, Policy Distillation, Robotics Manipulation, Diffusion Models
Abstract: Imitation Learning (IL) has proven effective for training policy networks on long-horizon tasks in complex, unstructured environments. However, its effectiveness relies heavily on access to large-scale, high-quality datasets.
To tackle this challenge, we introduce \AGIL~(\textbf{A}daptive \textbf{G}uidance policy distillation for \textbf{I}mitation \textbf{L}earning), a novel policy distillation framework that transfers knowledge from a pretrained Diffusion Model (DM) to a student policy.
\AGIL~utilizes the pretrained DM to generate training samples with diverse state observations for the student policy.
Unlike conventional distillation approaches, \AGIL~introduces a mechanism in which the pretrained DM generates samples by explicitly modeling the discrepancy between the student and teacher policies.
Furthermore, we theoretically prove that this generation scheme achieves maximal transition diversity with smaller action divergence, thereby outperforming methods that generate samples through action mixing.
Moreover, \AGIL~incorporates a discriminator to assess policy quality and encourage the student to mimic behaviors that closely align with the teacher's, further enhancing learning from mixed-quality data.
Finally, experiments on multiple challenging tasks in robomimic simulation and real-world environments demonstrate the effectiveness of \AGIL.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 10547