On robust overfitting: adversarial training induced distribution matters

Published: 07 Nov 2023, Last Modified: 13 Dec 2023, M3L 2023 Poster
Keywords: robust overfitting, adversarial training, generalization
Abstract: Robust overfitting has been observed to arise in adversarial training. We hypothesize that this phenomenon may be related to the evolution of the data distribution along the training trajectory. To investigate this, we select a set of checkpoints from adversarial training and perform standard training on the distributions induced by adversarial perturbation w.r.t. each checkpoint. We observe that the resulting models become increasingly hard to generalize once robust overfitting occurs, thereby supporting the hypothesis. We further show that the hardness of generalization on the induced distributions is related to a certain local property of the perturbation operator at each checkpoint. The connection between this local property and generalization on the induced distribution is proved by establishing an upper bound on the generalization error. Other interesting phenomena related to the adversarial training trajectory are also observed.
Submission Number: 68
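The experiment described in the abstract can be outlined as follows. This is a minimal sketch, not the authors' code: it takes a fixed checkpoint from adversarial training, builds the induced distribution by adversarially perturbing the training data w.r.t. that checkpoint, and then runs standard (non-adversarial) training of a fresh model on those perturbed samples. The PGD attack, its hyperparameters, and the function and model names are illustrative assumptions.

```python
# Sketch of training on the distribution induced by a fixed adversarial-training
# checkpoint. All names and hyperparameters are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L_inf PGD perturbation computed against a fixed checkpoint model."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv

def train_on_induced_distribution(checkpoint_model, fresh_model, loader,
                                  optimizer, epochs=100):
    """Standard training of a fresh model on samples perturbed w.r.t. the
    (frozen) adversarial-training checkpoint, i.e. the induced distribution."""
    checkpoint_model.eval()
    fresh_model.train()
    for _ in range(epochs):
        for x, y in loader:
            x_adv = pgd_perturb(checkpoint_model, x, y)  # induced samples
            optimizer.zero_grad()
            loss = F.cross_entropy(fresh_model(x_adv), y)
            loss.backward()
            optimizer.step()
```

Repeating this procedure for checkpoints taken before and after the onset of robust overfitting, and comparing the test accuracy of each freshly trained model, is one way to probe how hard the induced distributions are to generalize on.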