StoRM: Stochastic Region Mixup

ICLR 2026 Conference Submission 25053 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Mixup, Data augmentation, Vicinal Risk Minimization
Abstract: A number of data-augmentation strategies have been proposed to alleviate problems such as over-fitting, distribution shift, and adversarial attacks in deep neural networks. A growing body of literature investigates computationally expensive techniques, such as saliency cues, diffusion processes, or even fractal-like noise, to improve robustness and clean accuracy. Although these methods may be intuitively compelling, they have limited theoretical justification, especially given their computational inefficiency and other issues. In this paper, we take a different route and propose Stochastic Region Mixup (StoRM), which simply focuses on increasing the diversity of augmented samples. We show that this strategy outperforms saliency-based methods on several key metrics at lower computational overhead, and we argue that the key bottleneck in mixup-based methods is the dimensionality of the vicinal risk space. StoRM, a stochastic extension of Region Mixup, stochastically combines regions drawn from multiple images, yielding more diverse augmentations. We present empirical studies and theoretical analysis demonstrating that this richer augmentation space improves generalization and robustness while preserving label integrity through careful area-based mixing. Across benchmarks, StoRM consistently outperforms state-of-the-art mixup methods. The code will be released publicly upon acceptance.
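To make the idea of region-level stochastic mixing with area-based label weighting concrete, the sketch below shows one way such an augmentation could look. It is not the authors' released implementation; the function name `storm_augment`, the rectangular region shapes, the number of regions, and the uniform sampling of sources and sizes are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def storm_augment(images, labels, num_regions=3, rng=None):
    """Illustrative sketch of stochastic region mixing (hypothetical parameters).

    images: (N, H, W, C) float array; labels: (N, K) one-hot array.
    For each image, `num_regions` rectangular patches are copied in from
    randomly chosen batch members; labels are then mixed in proportion to
    the pixel area each source image finally occupies.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, h, w, _ = images.shape
    mixed_images = images.astype(np.float32).copy()
    mixed_labels = np.zeros_like(labels, dtype=np.float32)

    for i in range(n):
        owner = np.full((h, w), i, dtype=np.int64)  # which image owns each pixel
        for _ in range(num_regions):
            j = int(rng.integers(n))               # source image for this region
            rh = int(rng.integers(1, h // 2 + 1))  # region height (assumed <= H/2)
            rw = int(rng.integers(1, w // 2 + 1))  # region width (assumed <= W/2)
            y0 = int(rng.integers(0, h - rh + 1))
            x0 = int(rng.integers(0, w - rw + 1))
            mixed_images[i, y0:y0 + rh, x0:x0 + rw] = images[j, y0:y0 + rh, x0:x0 + rw]
            owner[y0:y0 + rh, x0:x0 + rw] = j
        # Area-based label mixing: weight each source by its surviving pixel count.
        sources, counts = np.unique(owner, return_counts=True)
        weights = counts / float(h * w)
        mixed_labels[i] = (weights[:, None] * labels[sources].astype(np.float32)).sum(axis=0)
    return mixed_images, mixed_labels
```

Under these assumptions, later regions may overwrite earlier ones; tracking a per-pixel owner map keeps the label weights exactly proportional to the area each image contributes to the final composite.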
Supplementary Material: pdf
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 25053