Keywords: mixup, calibration, confidence, robustness
TL;DR: We show that taking distance into account in mixup reduces the occurrence of mismatches between mixed labels and mixed samples, improving confidence, calibration, and robustness.
Abstract: Among the data augmentation techniques proposed so far, linear interpolation of training samples, also called Mixup, has been found to be effective for a wide range of applications.
Along with improved predictive performance, Mixup is also a good technique for improving calibration.
However, mixing data carelessly can lead to manifold mismatch, i.e., synthetic data lying outside original class manifolds, which can deteriorate calibration.
In this work, we show that the likelihood of assigning a wrong label with mixup increases with the distance between the data points being mixed.
We therefore propose to dynamically change the underlying distribution of interpolation coefficients depending on the similarity between the samples to mix, and define a flexible framework to do so without losing diversity.
We provide extensive experiments on classification and regression tasks, showing that our proposed method improves the predictive performance and calibration of models while being much more efficient.
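
The idea of making the interpolation coefficient depend on the similarity between the paired samples can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `similarity_mixup`, the Gaussian kernel that maps distance to the Beta parameter, and the parameters `alpha_max` and `sigma` are all assumptions chosen for illustration. The sketch only shows one plausible way to draw each coefficient from Beta(alpha, alpha) with alpha shrinking as the pair gets farther apart, so that distant pairs are mixed less.

```python
import numpy as np


def similarity_mixup(x, y, alpha_max=1.0, sigma=1.0, rng=None):
    """Mix each sample with a random partner, mixing distant pairs less.

    Hypothetical sketch: the distance-to-alpha kernel below is an
    assumption, not the rule used in the paper.
    x: array of shape (n, ...), y: one-hot labels of shape (n, c).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    perm = rng.permutation(n)  # random partner for every sample

    # Euclidean distance between each sample and its partner,
    # normalized by the mean distance in the batch.
    flat = x.reshape(n, -1)
    dist = np.linalg.norm(flat - flat[perm], axis=1)
    dist = dist / (dist.mean() + 1e-12)

    # Assumed Gaussian kernel: similar pairs get alpha close to alpha_max,
    # distant pairs get a small alpha, which pushes lambda toward 0 or 1.
    alpha = alpha_max * np.exp(-dist**2 / (2.0 * sigma**2))
    alpha = np.clip(alpha, 1e-2, None)  # Beta parameters must be positive

    lam = rng.beta(alpha, alpha)  # one coefficient per pair

    # Broadcast lambda over the feature dimensions of x.
    lam_x = lam.reshape((n,) + (1,) * (x.ndim - 1))
    x_mix = lam_x * x + (1.0 - lam_x) * x[perm]
    y_mix = lam[:, None] * y + (1.0 - lam[:, None]) * y[perm]
    return x_mix, y_mix, lam
```

For example, calling `similarity_mixup(x, y, rng=np.random.default_rng(0))` on a batch `x` of shape `(8, 3, 4, 4)` with one-hot labels `y` of shape `(8, 5)` returns mixed inputs and soft labels of the same shapes, along with the per-pair coefficients.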
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9966