Over-Training with Mixup May Hurt Generalization

Zixuan Liu; Ziqiao Wang; Hongyu Guo; Yongyi Mao

Over-Training with Mixup May Hurt Generalization

Zixuan Liu, Ziqiao Wang, Hongyu Guo, Yongyi Mao

Published: 01 Feb 2023, Last Modified: 02 Mar 2023ICLR 2023 posterReaders: Everyone

Keywords: Mixup, Generalization, Overfitting, Regularization

TL;DR: We empirically discovered a U-shaped generalization curve of Mixup training.

Abstract: Mixup, which creates synthetic training instances by linearly interpolating random sample pairs, is a simple and yet effective regularization technique to boost the performance of deep models trained with SGD. In this work, we report a previously unobserved phenomenon in Mixup raining: on a number of standard datasets, the performance of Mixup-trained models starts to decay after training for a large number of epochs, giving rise to a U-shaped generalization curve. This behavior is further aggravated when the size of original dataset is reduced. To help understand such a behavior of Mixup, we show theoretically that Mixup training may introduce undesired data-dependent label noises to the synthesized data. Via analyzing a least-square regression problem with a random feature model, we explain why noisy labels may cause the U-shaped curve to occur: Mixup improves generalization through fitting the clean patterns at the early training stage, but as training progresses, Mixup becomes over-fitting to the noise in the synthetic data. Extensive experiments are performed on a variety of benchmark datasets, validating this explanation.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

Supplementary Material: zip

13 Replies

Loading