CUPID: Curating Data your Robot Loves with Influence Functions

Published: 08 Aug 2025, Last Modified: 16 Sept 2025CoRL 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Imitation Learning, Data Curation, Influence Functions
TL;DR: We propose a data curation method for robot imitation learning that uses influence functions to measure the causal impact of a demonstration on the policy's closed-loop performance.
Abstract: In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes—such as closed-loop task success or failure—remains a persistent challenge. Inspired by the theory of influence functions, we propose CUPID. Given a set of evaluation rollouts, CUPID estimates the influence of a training demonstration on the policy’s expected return. This enables ranking and selection of demonstrations according to their impact on the policy’s closed-loop performance. We use our estimator to curate data by 1) filtering out training demonstrations that harmed the policy’s performance and 2) subselecting newly collected trajectories that will most help improve the policy. Extensive simulated and hardware experiments show that our approach consistently identifies which data drives test-time performance. For example, training with less than 33% of curated data can result in state-of-the-art diffusion policies on the simulated Robomimic benchmark, and we observe similar improvements in hardware experiments. Furthermore, our hardware experiments show that our influence-based estimator can identify robust strategies under distribution shift, isolate spurious correlations, and even enhance post-training of generalist policies.
Supplementary Material: zip
Spotlight: mp4
Submission Number: 910
Loading