CUPID: Curating Data your Robot Loves with Influence Functions

Christopher Agia; Rohan Sinha; Jingyun Yang; Rika Antonova; Marco Pavone; Haruki Nishimura; Masha Itkina; Jeannette Bohg

Published: 12 Jun 2025, Last Modified: 22 Jun 2025

Venue: RobotEvaluation@RSS 2025 Poster

License: CC BY 4.0

Type: An approach-centric paper (introducing new robot systems and approaches with a strong emphasis on real-world applicability and evaluation)

Keywords: Imitation Learning, Data Curation, Influence Functions

TL;DR: We propose a data curation method for robot imitation learning that uses influence functions to measure the causal impact of a demonstration on the policy's closed-loop performance.

Abstract: In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes – such as closed-loop task success or failure – remains a persistent challenge. We propose CUPID, a robot data curation method based on a novel influence function-theoretic formulation for imitation learning policies. Given a set of evaluation rollouts, CUPID estimates the influence of each training demonstration on the policy's expected return. This enables ranking and selection of demonstrations according to their impact on the policy's closed-loop performance. We use CUPID to curate data by 1) filtering out training demonstrations that harm policy performance and 2) subselecting newly collected trajectories that will most improve the policy. Extensive simulated and hardware experiments show that CUPID can significantly improve policy performance in mixed-quality regimes, identify robust strategies under test-time distribution shifts, and even disentangle spurious correlations in training data that hinder generalization. Additional materials are made available at: https://cupid-curation.github.io.
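To make the idea of ranking demonstrations by their influence on expected return concrete, here is a minimal sketch. It is not the CUPID implementation, since the abstract does not specify the estimator. It assumes a toy linear policy with unit-variance Gaussian noise trained by squared-error behavior cloning, a REINFORCE-style estimate of the return gradient, and a damped Hessian. The names `bc_grad`, `demo_influences`, and `damping` are hypothetical.

```python
import numpy as np

def bc_grad(theta, states, actions):
    """Gradient of the summed loss 0.5 * (a - s @ theta)^2 over one trajectory."""
    residual = actions - states @ theta
    return -(states.T @ residual)

def demo_influences(theta, demos, rollouts, damping=1e-3):
    """Estimate how upweighting each demonstration changes expected return.

    demos:    list of (states [T, d], actions [T]) training trajectories
    rollouts: list of (states [T, d], actions [T], return) evaluation rollouts
    Assumption: unit-variance Gaussian policy, so grad log pi = -bc_grad.
    """
    # Hessian of the mean behavior cloning loss, damped for invertibility.
    all_states = np.concatenate([s for s, _ in demos])
    hessian = all_states.T @ all_states / len(demos)
    hessian += damping * np.eye(theta.size)

    # REINFORCE-style return gradient: mean over rollouts of R * grad log pi.
    grad_return = np.mean(
        [-ret * bc_grad(theta, s, a) for s, a, ret in rollouts], axis=0
    )

    # Upweighting demo i by eps shifts theta by -eps * H^{-1} grad L_i,
    # so the return changes by -grad_J^T H^{-1} grad L_i per unit eps.
    v = np.linalg.solve(hessian, grad_return)
    return np.array([-v @ bc_grad(theta, s, a) for s, a in demos])

# Toy data: one state feature, two demonstrations with opposite actions.
theta = np.zeros(1)
demos = [
    (np.array([[1.0]]), np.array([1.0])),   # agrees with the successful rollout
    (np.array([[1.0]]), np.array([-1.0])),  # agrees with the failed rollout
]
rollouts = [
    (np.array([[1.0]]), np.array([1.0]), 1.0),   # success, return 1
    (np.array([[1.0]]), np.array([-1.0]), 0.0),  # failure, return 0
]

scores = demo_influences(theta, demos, rollouts)
ranking = np.argsort(-scores)       # most helpful demonstration first
keep = np.flatnonzero(scores >= 0)  # filter demonstrations estimated to harm return
```

In this toy setting the demonstration whose action matches the successful rollout receives a positive score and the other a negative one, so filtering on the sign keeps only the first. The same scores could be used to subselect newly collected trajectories by taking the top entries of `ranking`.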

Submission Number: 3
