Abstract: Coreset selection reduces training cost by constructing compact, representative subsets, but existing methods largely assume balanced class distributions. Under imbalance, this assumption yields biased subsets that discard critical minority samples and degrade accuracy. We propose Equitable Coreset Selection (ECS), a framework tailored for imbalanced data. ECS mitigates these issues through adaptive pruning that preserves minority examples, class-sensitive partitioning aligned with skewed class distributions, and stratified graph-cut selection for diverse sampling. Experiments across multiple imbalanced datasets show that ECS improves generalization and substantially boosts minority-class accuracy compared to standard coreset methods.
External IDs: doi:10.1145/3746252.3760971