Difference-Aware Retrieval Policies for Imitation Learning

ICLR 2026 Conference Submission22159 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: behavior cloning, robotics, imitation learning, nearest neighbor, retrieval
TL;DR: DARP addresses behavior cloning's instability by conditioning action predictions on difference vectors between query states and neighbors
Abstract: Behavior cloning suffers from poor generalization to out-of-distribution states due to compounding errors during deployment. We present Difference-Aware Retrieval Policies for Imitation Learning (DARP), a novel nearest-neighbor-based imitation learning approach that addresses this limitation by reparameterizing the imitation learning problem in terms of local neighborhood structure rather than direct state-to-action mappings. Instead of learning a global policy, DARP trains a model to predict actions from the k nearest neighbors in the expert demonstrations, their corresponding actions, and the difference vectors between the neighbor states and the query state. Our method requires no additional data collection, online expert feedback, or task-specific knowledge beyond standard behavior cloning prerequisites. We demonstrate consistent performance improvements of 15-46% over standard behavior cloning across diverse domains, including continuous control and robotic manipulation, and across different representations, including high-dimensional visual features.
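The inputs described in the abstract (nearest expert states, their actions, and difference vectors to the query state) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `retrieve_neighbors` and `placeholder_model` are hypothetical, Euclidean distance is an assumed retrieval metric, and the inverse-distance average merely stands in for the trained model, whose architecture the abstract does not specify.

```python
import math

def retrieve_neighbors(query, states, actions, k):
    """Return the k nearest expert (state, action, difference) triples.

    Assumption: neighbors are ranked by Euclidean distance to the query.
    """
    order = sorted(range(len(states)),
                   key=lambda i: math.dist(query, states[i]))[:k]
    neighbors = []
    for i in order:
        # Difference vector pointing from the neighbor state to the query state.
        diff = [q - s for q, s in zip(query, states[i])]
        neighbors.append((states[i], actions[i], diff))
    return neighbors

def placeholder_model(neighbors, eps=1e-8):
    """Hypothetical stand-in for the learned model.

    Averages neighbor actions with inverse-distance weights; a trained
    network would consume the same (state, action, difference) triples.
    """
    weights = [1.0 / (math.hypot(*diff) + eps) for _, _, diff in neighbors]
    total = sum(weights)
    action_dim = len(neighbors[0][1])
    return [
        sum(w * nb[1][j] for w, nb in zip(weights, neighbors)) / total
        for j in range(action_dim)
    ]

# Toy expert dataset: 2-D states, 1-D actions.
expert_states = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
expert_actions = [[0.0], [1.0], [9.0]]

query_state = [0.5, 0.0]
neighbors = retrieve_neighbors(query_state, expert_states, expert_actions, k=2)
predicted_action = placeholder_model(neighbors)
```

In this toy setup the query lies midway between the first two expert states, so the placeholder weights their actions equally; the distant third demonstration is never retrieved.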
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 22159