Adversarial Robustness via Adaptive Label Smoothing

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Abstract: Adversarial training (AT) has become a dominant defense paradigm by enforcing the model's predictions to be locally invariant to adversarial examples. Label smoothing (LS), despite being a simple technique, has shown potential for improving model robustness. However, prior work shows that the benefit of directly combining the two techniques is limited. In this paper, we aim to better understand the behavior of LS and to develop more effective LS algorithms for improving adversarial robustness. We first show, both theoretically and empirically, that strong smoothing in AT increases the local smoothness of the loss surface, which benefits robustness, but also increases the training loss, which hurts accuracy on samples near the decision boundary. Based on this result, we propose \textit{surface smoothing adversarial training} (SSAT). Specifically, stronger smoothing is applied to perturbed examples farther from the decision boundary to achieve better robustness, while weaker smoothing is applied to those closer to the boundary to avoid misclassifying clean samples. Moreover, LS induces a representation space over the data classes that differs from that of other AT methods. We study this distinction and further propose a cooperative defense strategy termed Co-SSAT. Experimental results show that Co-SSAT achieves state-of-the-art performance on CIFAR-10 against $\ell_{\infty}$ adversaries and also generalizes well to unseen attacks, e.g., other $\ell_p$ norms or larger perturbations, owing to the smoothness of the loss surface.
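Below is a minimal sketch of the adaptive smoothing idea described in the abstract, assuming a logit-margin proxy for distance to the decision boundary and a linear per-example smoothing schedule; the function names (`adaptive_smoothing_targets`, `ssat_loss`) and the schedule itself are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def adaptive_smoothing_targets(logits, labels, num_classes, eps_min=0.05, eps_max=0.4):
    """Per-example smoothed targets: larger margin -> stronger smoothing (assumed schedule)."""
    with torch.no_grad():
        top2 = logits.topk(2, dim=1).values            # two largest logits per example
        margin = (top2[:, 0] - top2[:, 1]).clamp(min=0.0)
        w = margin / (margin.max() + 1e-12)            # normalize margins to [0, 1]
        eps = eps_min + (eps_max - eps_min) * w        # per-example smoothing strength
    one_hot = F.one_hot(labels, num_classes).float()
    return (1.0 - eps.unsqueeze(1)) * one_hot + eps.unsqueeze(1) / num_classes

def ssat_loss(model, x_adv, labels, num_classes=10):
    """Cross-entropy against adaptively smoothed targets on adversarial inputs."""
    logits = model(x_adv)
    targets = adaptive_smoothing_targets(logits, labels, num_classes)
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

The margin-based weighting follows the abstract's stated principle (stronger smoothing farther from the boundary, weaker smoothing closer to it); how the paper actually measures boundary distance is not specified here.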