MODE: Multi-Objective Dynamic Coreset Selection

ICLR 2026 Conference Submission12768 Authors

18 Sept 2025 (modified: 08 Oct 2025)
License: CC BY 4.0
Keywords: Coreset selection, Submodularity
TL;DR: MODE adaptively selects the most useful data throughout training, achieving theoretical guarantees and scalable efficiency while preserving model performance with far less data.
Abstract: We present MODE (Multi-Objective adaptive Data Efficiency), a framework that dynamically combines coreset selection strategies based on their evolving contribution to model performance. Unlike static methods, MODE adapts its selection criteria to the training phase: emphasizing class balance early, diversity during representation learning, and uncertainty near convergence. We show that MODE achieves a $(1-1/e)$-approximation with $O(n \log n)$ complexity, and we demonstrate competitive accuracy while providing interpretable insights into how data utility evolves. Experiments show MODE reduces memory requirements by 10× on ImageNet while revealing which data types matter most during different training phases.
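The phase-adaptive idea in the abstract, weighting class balance early, diversity mid-training, and uncertainty near convergence, can be illustrated with a minimal sketch. The schedule below (`phase_weights`), the per-sample score arrays, and the function names are all hypothetical stand-ins, not the paper's actual method; with a modular (additive) combined score, greedy top-k selection is exact, and for monotone submodular objectives the same greedy strategy yields the classic $(1-1/e)$ guarantee mentioned in the abstract.

```python
import numpy as np

def phase_weights(progress):
    """Hypothetical schedule over training progress in [0, 1]:
    class balance dominates early, diversity mid-training,
    uncertainty near convergence."""
    balance = max(0.0, 1.0 - 2.0 * progress)
    diversity = 1.0 - abs(2.0 * progress - 1.0)
    uncertainty = max(0.0, 2.0 * progress - 1.0)
    total = balance + diversity + uncertainty
    return balance / total, diversity / total, uncertainty / total

def select_coreset(scores_balance, scores_diversity, scores_uncertainty,
                   k, progress):
    """Greedy top-k under a phase-weighted combined score.
    Each scores_* array holds one utility value per training example."""
    w_b, w_d, w_u = phase_weights(progress)
    combined = (w_b * scores_balance
                + w_d * scores_diversity
                + w_u * scores_uncertainty)
    # For a modular objective, picking the k highest-scoring examples
    # is the greedy optimum; argsort descending, keep the first k.
    return np.argsort(-combined)[:k]
```

Early in training (`progress=0.0`) the balance weight is 1 and selection tracks the class-balance scores alone; at convergence (`progress=1.0`) the uncertainty scores take over.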
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 12768