AGPD: Adaptive Guidance Policy Distillation for Imitation Learning

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Imitation Learning, Policy Distillation, Robotics Manipulation, Diffusion Models
Abstract: Imitation Learning (IL) has proven effective for training policy networks on long-horizon tasks in complex, unstructured environments. However, its effectiveness relies heavily on access to large-scale, high-quality datasets. To address this challenge, we introduce \AGIL~(\textbf{A}daptive \textbf{G}uidance \textbf{P}olicy \textbf{D}istillation for Imitation Learning), a novel policy distillation framework that transfers knowledge from a pretrained Diffusion Model (DM) to a student policy. \AGIL~uses the pretrained DM to generate samples with diverse state observations for training the student policy. Unlike conventional distillation approaches, \AGIL~introduces a mechanism in which the pretrained DM generates samples by explicitly modeling the discrepancy between the student and teacher policies. We theoretically prove that this generation approach achieves maximal transition diversity with smaller action divergence, thereby outperforming methods that generate samples through action mixing. Moreover, \AGIL~incorporates a discriminator to assess policy quality and encourage the student to mimic behaviors that closely align with the teacher, further enhancing learning from mixed-quality data. Finally, experiments on multiple challenging tasks across robomimic simulation and real-world environments demonstrate the effectiveness of \AGIL.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 10547