Keywords: fast adversarial training, bi-level optimization, adversarial robustness, adversarial defense
Abstract: Adversarial training (AT) has become a widely recognized defense mechanism for improving the robustness of deep neural networks against adversarial attacks. It originates from solving a min-max optimization problem, where the minimizer (i.e., defender) seeks a robust model that minimizes the worst-case training loss in the presence of adversarial examples crafted by the maximizer (i.e., attacker). However, the min-max nature makes AT computationally intensive and thus difficult to scale, which gives rise to the problem of FAST-AT. Nearly all recent progress is based on the following simplification: the iterative attack generation method used in the maximization step of AT is replaced by the simplest one-shot gradient sign-based PGD method. Nevertheless, FAST-AT is far from satisfactory, and it lacks a theoretically grounded design. For example, a FAST-AT method may suffer from robustness catastrophic overfitting when training with strong adversaries.
In this paper, we advance the design of FAST-AT through the lens of bi-level optimization (BLO) instead of min-max optimization. First, we theoretically show that the most commonly used algorithmic specification of FAST-AT is equivalent to a linearized BLO along the direction given by the sign of the input gradient. Second, with the aid of BLO, we develop a new systematic and effective fast bi-level AT framework, termed FAST-BAT, whose algorithm is rigorously derived by leveraging the theory of implicit gradients. In contrast to FAST-AT, FAST-BAT imposes the fewest restrictions on the trade-off between computational efficiency and adversarial robustness. For example, it is capable of defending against sign-based projected gradient descent (PGD) attacks without calling any gradient sign method or explicit robust regularization during training. Furthermore, we empirically show that our method outperforms state-of-the-art FAST-AT baselines. In particular, FAST-BAT can achieve superior model robustness without inducing robustness catastrophic overfitting or losing standard accuracy.
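The one-shot gradient sign simplification that the abstract attributes to FAST-AT can be illustrated with a minimal sketch. This is not the authors' implementation, and it is not FAST-BAT: the logistic-regression model, the function names, and the budget `eps` are all assumptions chosen so the example stays self-contained. It only shows a single sign-of-input-gradient step under an l-infinity budget, which replaces the iterative inner maximization.

```python
import numpy as np

# Hypothetical toy model: binary logistic regression with weights w and bias b.
# All names here are illustrative assumptions, not taken from the paper.

def logistic_loss(w, b, x, y):
    """Cross-entropy loss for one example, with label y in {0, 1}."""
    z = float(np.dot(w, x) + b)
    # log(1 + exp(z)) - y * z, computed in a numerically stable way
    return float(np.logaddexp(0.0, z) - y * z)

def input_gradient(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    z = float(np.dot(w, x) + b)
    p = 1.0 / (1.0 + np.exp(-z))  # predicted probability of class 1
    return (p - y) * w

def one_step_sign_attack(w, b, x, y, eps):
    """Single gradient-sign step: a perturbation inside the l-infinity ball of radius eps."""
    grad = input_gradient(w, b, x, y)
    delta = eps * np.sign(grad)
    # Projection onto the eps-ball (a no-op here, kept to make the constraint explicit)
    return np.clip(delta, -eps, eps)

w = np.array([1.5, -2.0, 0.5])
b = 0.1
x = np.array([0.2, 0.4, -0.1])
y = 1.0
eps = 0.1

delta = one_step_sign_attack(w, b, x, y, eps)
clean_loss = logistic_loss(w, b, x, y)
adv_loss = logistic_loss(w, b, x + delta, y)
print("perturbation:", delta)
print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

In a training loop, the defender would then take a gradient step on the loss evaluated at `x + delta`. The sketch uses one attack step per example, which is the source of the efficiency gain over multi-step attack generation.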
One-sentence Summary: In this paper, we introduce a novel bi-level optimization (BLO)-based fast adversarial training framework, termed FAST-BAT.
Supplementary Material: zip