AGPD: Adaptive Guidance Policy Distillation for Imitation Learning

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Imitation Learning, Policy Distillation, Robotics Manipulation, Diffusion Models
Abstract: Imitation Learning (IL) has proven effective for training policy networks on long-horizon tasks in complex, unstructured environments. However, its effectiveness relies heavily on access to large-scale, high-quality datasets. To address this challenge, we introduce \AGIL~(\textbf{A}daptive \textbf{G}uidance \textbf{P}olicy \textbf{D}istillation for Imitation Learning), a novel policy distillation framework that transfers knowledge from a pretrained Diffusion Model (DM) to a student policy. \AGIL~uses the pretrained DM to generate samples with diverse state observations for training the student policy. Unlike conventional distillation approaches, \AGIL~introduces a mechanism in which the pretrained DM generates samples by explicitly modeling the discrepancy between the student and teacher policies. We theoretically prove that this generation approach achieves maximal transition diversity with smaller action divergence, thereby outperforming methods that generate samples through action mixing. Moreover, \AGIL~incorporates a discriminator to assess policy quality and encourage the student to mimic behaviors that closely align with the teacher, further enhancing learning from mixed-quality data. Finally, experiments on multiple challenging tasks across robomimic simulation and real-world environments demonstrate the effectiveness of \AGIL.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 10547