Automated Discovery of Adaptive Attacks on Adversarial Defenses

Published: 14 Jul 2021, Last Modified: 22 Oct 2023, AutoML@ICML2021 Oral
Abstract: Reliable evaluation of adversarial defenses is a challenging task, currently limited either to experts who manually craft attacks that exploit the defense’s inner workings, or to approaches based on ensembles of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that custom attacks are composed from a set of reusable building blocks, such as fine-tuning relevant attack parameters, network transformations, and custom loss functions. Based on this observation, we present an extensible framework that defines a search space over these reusable building blocks and automatically discovers an effective attack on a given model with an unknown defense by searching over suitable combinations of these blocks. We evaluated our framework on 23 adversarial defenses and showed that it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses: our discovered attacks are either stronger, producing 3.0%-50.8% additional adversarial examples (10 cases), or are typically 2x faster while achieving similar adversarial robustness (13 cases).
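The search-space idea can be illustrated with a minimal random-search sketch. The block names, parameter values, and the `evaluate_attack` stub below are hypothetical placeholders for illustration only; they are not the framework's actual components or search procedure, which the paper describes in terms of attack parameters, network transformations, and custom loss functions.

```python
import random

# Hypothetical search space of attack building blocks (illustrative only).
SEARCH_SPACE = {
    "loss":       ["cross_entropy", "margin", "dlr"],
    "transform":  ["none", "remove_softmax", "bpda_substitute"],
    "step_size":  [0.001, 0.01, 0.1],
    "n_steps":    [20, 100, 500],
    "n_restarts": [1, 5, 10],
}

def sample_attack(rng):
    """Sample one combination of building blocks from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}

def evaluate_attack(attack):
    """Placeholder: a real implementation would run the configured attack
    against the defended model and return its robust accuracy
    (lower means a stronger attack)."""
    return random.random()

def search(n_trials=50, seed=0):
    """Randomly search for the combination of blocks that is most effective."""
    rng = random.Random(seed)
    best_attack, best_score = None, float("inf")
    for _ in range(n_trials):
        attack = sample_attack(rng)
        score = evaluate_attack(attack)
        if score < best_score:
            best_attack, best_score = attack, score
    return best_attack, best_score

if __name__ == "__main__":
    attack, robust_acc = search()
    print(f"best combination: {attack} (robust accuracy {robust_acc:.3f})")
```

A real search would replace the random stub with an actual evaluation of each candidate attack on the defended model and could use a more sample-efficient strategy than uniform random sampling.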
Ethics Statement: In this paper, the authors propose an approach to improve the evaluation of adversarial defenses by automatically finding adaptive adversarial attacks. In general, such a tool can be used both in a beneficial way by researchers developing adversarial defenses and in a malicious way by an attacker trying to break existing models. In both cases, the approach is designed to improve empirical model evaluation rather than to provide verified model robustness, and thus is not intended to provide formal robustness guarantees for safety-critical applications.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2102.11860/code)