Testing Robustness Against Unforeseen Adversaries

Daniel Kang; Yi Sun; Dan Hendrycks; Tom B Brown; Jacob Steinhardt

28 Sept 2020 (modified: 22 Jun 2025); ICLR 2021 Conference Blind Submission; Readers: Everyone

Keywords: adversarial examples, adversarial training, adversarial attacks

Abstract: Most existing adversarial defenses only measure robustness to $L_p$ adversarial attacks. Not only are adversaries unlikely to exclusively create small $L_p$ perturbations, they are also unlikely to remain fixed. Adversaries adapt and evolve their attacks; hence adversarial defenses must be robust to a broad range of unforeseen attacks. We address this discrepancy between research and reality by proposing a new evaluation framework called ImageNet-UA. Our framework enables the research community to test ImageNet model robustness against attacks not encountered during training. To create ImageNet-UA's diverse attack suite, we introduce a total of four novel adversarial attacks. We also demonstrate that, in comparison to ImageNet-UA, prevailing $L_\infty$ robustness assessments give a narrow account of adversarial robustness. By evaluating current defenses with ImageNet-UA, we find they provide little robustness to unforeseen attacks. We hope the greater variety and realism of ImageNet-UA will enable the development of more robust defenses that generalize beyond attacks seen during training.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

One-sentence Summary: We propose several new attacks and a framework to measure robustness against unforeseen adversarial attacks.

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/testing-robustness-against-unforeseen/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=kzhU8U6tP9
