Fighting Gradients with Gradients: Dynamic Defenses against Adversarial AttacksDownload PDF

21 May 2021 (modified: 05 May 2023)NeurIPS 2021 SubmittedReaders: Everyone
Keywords: adversarial, robustness, defense, dynamic
Abstract: Adversarial attacks optimize against models to defeat defenses. Existing defenses are static, and stay the same once trained, even while attacks change. We argue that models should fight back, and optimize their defenses against attacks at test time. We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent). Dent alters testing, but not training, for compatibility with existing models and train-time defenses. Dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet. In particular, dent boosts state-of-the-art defenses by 20+ points absolute against AutoAttack on CIFAR-10 at $\epsilon_\infty = 8/255$.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: Adversarial attacks optimize against defenses, so defenses should fight back, and optimize against attacks at test-time: dent updates by entropy minimization to boost the robustness of adversarially-trained and nominally-trained models.
Supplementary Material: zip
13 Replies

Loading