Double Descent in Adversarial Training: An Implicit Label Noise Perspective

Published: 28 Jan 2022, Last Modified: 13 Feb 2023. ICLR 2022 Submission.
Keywords: Adversarial training, Robust overfitting, Double descent, Label noise
Abstract: We show that robust overfitting should be viewed as the early part of an epoch-wise double descent: the robust test error starts to decrease again after the model has been trained for a considerable number of epochs. Inspired by this observation, we further advance the analysis of double descent to better understand robust overfitting. In standard training, double descent has been shown to result from label flipping noise. However, this reasoning does not apply in our setting, since adversarial perturbations are believed not to change the label. Going beyond label flipping noise, we propose to measure the mismatch between the assigned and the (unknown) true label distributions, which we term \emph{implicit label noise}. We show that the conventional practice of labeling adversarial examples with the labels of their clean counterparts leads to implicit label noise. Towards better labeling, we show that the predicted distribution of a classifier, after scaling and interpolation, can provably reduce the implicit label noise under mild assumptions. In light of these analyses, we tailor the training objective accordingly to effectively mitigate the double descent and verify its effectiveness on three benchmark datasets.
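The labeling idea described in the abstract (scaling a classifier's predicted distribution and interpolating it with the assigned label) can be sketched as follows. This is a minimal, hypothetical illustration assuming a PyTorch classifier `model`; the function names, the temperature, and the interpolation weight `alpha` are placeholders and not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def soft_adversarial_targets(model, x_clean, y, temperature=2.0, alpha=0.5):
    """Build interpolated soft labels for adversarial training.

    Hypothetical sketch: mix the assigned one-hot label with the model's
    temperature-scaled predicted distribution on the clean input.
    `temperature` and `alpha` are illustrative hyperparameters.
    """
    with torch.no_grad():
        logits = model(x_clean)
        # Temperature scaling softens the predicted distribution.
        p_scaled = F.softmax(logits / temperature, dim=1)
    one_hot = F.one_hot(y, num_classes=logits.size(1)).float()
    # Interpolate between the assigned label and the scaled prediction.
    return alpha * one_hot + (1.0 - alpha) * p_scaled

def adversarial_soft_label_loss(model, x_adv, soft_targets):
    """Cross-entropy of adversarial examples against the soft targets."""
    log_probs = F.log_softmax(model(x_adv), dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```

In use, `soft_adversarial_targets` would replace the one-hot labels in the inner cross-entropy of a standard adversarial training loop (e.g., on PGD-generated `x_adv`), so the adversarial examples are no longer forced to match the labels inherited from their clean counterparts.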
One-sentence Summary: We update the understanding of robust overfitting by showing that it is the early part of an epoch-wise double descent, and design an effective algorithm to mitigate it.
Supplementary Material: zip