Keywords: Imitation Learning, Data Curation, Large Robot Datasets
TL;DR: We learn to estimate each data point's effect on imitation learning performance via datamodels; we then leverage these estimates to select data that maximizes downstream policy success
Abstract: Recently, the robotics community has amassed ever larger and more diverse datasets to train generalist robot policies. However, while these policies achieve strong mean performance across a variety of tasks, they often underperform on individual, specialized tasks and require further tuning on newly acquired task-specific data. Combining task-specific data with carefully curated subsets of large prior datasets can produce better specialized policies, but selecting data naively may actually harm downstream performance. To address this, we introduce DataMIL, a data selection framework built on the datamodels paradigm that reasons about data selection in an end-to-end manner, using the policy itself to identify which data points will most improve performance. Unlike standard practices that filter data using human notions of quality (e.g., semantic or visual similarity), DataMIL directly optimizes data selection for task success, allowing us to select data that enhance the policy while dropping data that degrade it. To avoid performing expensive rollouts in the environment during selection, we use a surrogate loss function on task-specific data, enabling DataMIL to operate in the real world without sacrificing performance. We validate our approach on a suite of 60+ simulation and real-world manipulation tasks—most notably showing successful data selection from the Open X-Embodiment datasets. Our results underscore the importance of end-to-end, performance-aware data selection for unlocking the potential of large prior datasets in robotics.
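To make the datamodels idea concrete, below is a minimal illustrative sketch, not the paper's implementation: it assumes synthetic data and hypothetical names (subset_masks, surrogate_scores, influence_estimates), and fits a linear model from subset membership to a surrogate performance score so each learned weight approximates one data point's effect; the top-weighted points are then selected.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N candidate prior data points, M randomly sampled
# training subsets. For each subset we record a surrogate score (e.g. the
# policy's loss on held-out task-specific data after training on that subset).
N, M = 200, 500
subset_masks = rng.random((M, N)) < 0.5            # which points each subset contains
true_influence = rng.normal(0.0, 1.0, size=N)      # stand-in for the unknown per-point effect
surrogate_scores = subset_masks @ true_influence + rng.normal(0.0, 0.1, size=M)

# Datamodel: fit a linear map from subset membership to the surrogate score,
# so each learned weight estimates one data point's contribution.
X = np.hstack([subset_masks.astype(float), np.ones((M, 1))])   # append a bias column
weights, *_ = np.linalg.lstsq(X, surrogate_scores, rcond=None)
influence_estimates = weights[:-1]

# Select the k data points predicted to most improve the surrogate score.
k = 50
selected = np.argsort(influence_estimates)[-k:]
print("selected data point indices:", sorted(selected.tolist()))
```

In practice the surrogate score would come from actually training or fine-tuning the policy on each sampled subset and evaluating its loss on task-specific data, rather than from the synthetic linear model used here for illustration.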
Supplementary Material: zip
Lightning Talk Video: mp4
Optional Poster Upload: pdf
Submission Number: 33