Phase-shifted Adversarial TrainingDownload PDF

Published: 08 May 2023, Last Modified: 26 Jun 2023UAI 2023Readers: Everyone
Keywords: Adversarial Training, Frequency Principle
TL;DR: we present a novel approach to adversarial training, coined phase-shifted adversarial training, to learn the adversarial examples in an efficient manner.
Abstract: Adversarial training (AT) has been considered an imperative component for safely deploying neural network-based applications. However, it typically comes with slow convergence and worse performance on clean samples (i.e., non-adversarial samples). In this work, we analyze the behavior of neural networks during learning with adversarial samples through the lens of response frequency. Interestingly, we observe that AT causes neural networks to converge slowly to high-frequency information, resulting in highly oscillatory predictions near each data point. To learn high-frequency content efficiently, we first prove that a universal phenomenon, the frequency principle (i.e., lower frequencies are learned first), still holds in AT. Building upon this theoretical foundation, we present a novel approach to AT, which we call phase-shifted adversarial training (PhaseAT). In PhaseAT, the high-frequency components, which are a contributing factor to slow convergence, are adaptively shifted into the low-frequency range where faster convergence occurs. For evaluation, we conduct extensive experiments on CIFAR-10 and ImageNet, using an adaptive attack that is carefully designed for reliable evaluation. Comprehensive results show that PhaseAT substantially improves convergence for high-frequency information, thereby leading to improved adversarial robustness.
Supplementary Material: pdf
Other Supplementary Material: zip
0 Replies