ExploreAugment: Adaptive Exploratory Data Augmentation based on Boundary Awareness

18 Sept 2025 (modified: 13 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Data Augmentation, Boundary-Aware Learning, Diffusion Models, Data Efficiency, Model-Aware Augmentation, Classification
TL;DR: ExploreAugment is a model-aware data augmentation framework that generates boundary-focused samples via latent interpolation, achieving higher accuracy with only about 15% of the augmented data used by traditional methods.
Abstract: Traditional data augmentation often applies uniform transformations to all samples, prioritizing data volume over a model's specific weaknesses. This indiscriminate approach can produce redundant data and inefficient training. We propose ExploreAugment, a model-aware data augmentation framework that dynamically targets and refines decision-critical regions in the latent space. Our method first identifies key samples using task-specific selection strategies. It then uses diffusion-based latent interpolation to generate samples that are boundary-ambiguous yet semantically valid. These samples are integrated into training through a closed-loop pipeline that continuously adapts to the evolving model state. Extensive experiments across multiple datasets show that ExploreAugment consistently improves task performance while significantly reducing augmentation overhead. Notably, our approach outperforms the best baseline by 7.14% on ResNet-50 and 1.75% on DeiT while using only about 15% of the data volume generated by other augmentation methods. These results highlight the advantage of boundary-aware, model-driven augmentation for data-efficient learning.
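The select-then-interpolate loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a top-1/top-2 probability margin as the selection rule, and it substitutes plain linear interpolation between latent vectors for the diffusion-based interpolation. All function names (`select_boundary_samples`, `interpolate_latents`, `explore_augment_step`) and parameters (`k`, `alpha`) are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def select_boundary_samples(probs, k):
    """Return indices of the k samples with the smallest top-1/top-2 margin.

    Assumption: a small margin is used as a proxy for "near the decision
    boundary"; the paper's task-specific selection strategies may differ.
    """
    top2 = np.sort(probs, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return np.argsort(margin)[:k]


def interpolate_latents(z_a, z_b, alpha):
    """Linear interpolation in latent space (stand-in for diffusion-based
    interpolation, which is not reproduced here)."""
    return (1.0 - alpha) * z_a + alpha * z_b


def explore_augment_step(latents, labels, classifier_logits, k, alpha=0.5, rng=None):
    """One round of the loop: score with the current model, select
    low-margin samples, and mix each with a sample from another class.

    Requires at least two distinct classes in `labels`.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax(classifier_logits(latents))
    selected = select_boundary_samples(probs, k)
    new_latents = []
    for i in selected:
        # Partner is drawn uniformly from samples with a different label.
        others = np.flatnonzero(labels != labels[i])
        j = rng.choice(others)
        new_latents.append(interpolate_latents(latents[i], latents[j], alpha))
    return selected, np.stack(new_latents)
```

In a closed-loop setting, `explore_augment_step` would be called again after each training phase so that selection reflects the updated classifier. Only `k` new samples are produced per round, in contrast to augmenting every sample.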
Supplementary Material: zip
Primary Area: generative models
Submission Number: 11386