Automated Discovery of Adaptive Attacks on Adversarial Defenses

Chengyuan Yao, Pavol Bielik, Petar Tsankov, Martin Vechev

Published: 09 Nov 2021, Last Modified: 26 May 2025

Venue: NeurIPS 2021 Poster

Readers: Everyone

Keywords: Deep Learning, Adversarial Attacks, Adversarial Defences, Robustness

Abstract: Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense’s inner workings, or to approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that adaptive attacks are composed of a set of reusable building blocks that can be formalized in a search space and used to automatically discover attacks for unknown defenses. We evaluate our approach on 24 adversarial defenses and show that it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses: our tool discovers significantly stronger attacks, producing 3.0%-50.8% additional adversarial examples for 10 models, while obtaining attacks of slightly greater or similar strength for the remaining models.
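
The idea of composing attacks from reusable building blocks and searching over the combinations can be illustrated with a toy sketch. This is not the authors' tool or their search algorithm: the linear "model", the two loss gradients, the two step rules, and every name below are hypothetical, and plain exhaustive enumeration stands in for whatever search procedure the paper uses. It only shows how a small search space of attack components can be scored against one fixed model under an L-infinity budget `eps`.

```python
import itertools
import math
import random

# Hypothetical stand-in for a defended model: a fixed linear classifier.
W, B = [1.0, -2.0, 0.5], 0.1

def score(x):
    return sum(w * xi for w, xi in zip(W, x)) + B

def predict(x):
    return 1 if score(x) >= 0 else -1

# Building block 1: input gradients of two losses (label y is -1 or +1).
def grad_margin(x, y):
    # Gradient of -y * score(x) with respect to x.
    return [-y * w for w in W]

def grad_logistic(x, y):
    # Gradient of log(1 + exp(-y * score(x))) with respect to x.
    s = 1.0 / (1.0 + math.exp(y * score(x)))
    return [-y * w * s for w in W]

# Building block 2: how a gradient is turned into an update.
def step_sign(g, alpha):
    return [alpha * ((v > 0) - (v < 0)) for v in g]

def step_l2(g, alpha):
    norm = math.sqrt(sum(v * v for v in g)) or 1.0
    return [alpha * v / norm for v in g]

LOSSES = {"margin": grad_margin, "logistic": grad_logistic}
STEPS = {"sign": step_sign, "l2": step_l2}
N_STEPS = [1, 10]

def attack_succeeds(x0, y, grad_fn, step_fn, n_steps, eps):
    """Iterative gradient ascent on the loss, projected onto the eps box."""
    x = list(x0)
    for _ in range(n_steps):
        delta = step_fn(grad_fn(x, y), eps / n_steps)
        x = [min(max(xi + d, ci - eps), ci + eps)
             for xi, d, ci in zip(x, delta, x0)]
        if predict(x) != y:
            return True
    return False

def search(points, eps=0.2):
    """Score every combination of building blocks; return the best one."""
    results = {}
    for loss, step, n in itertools.product(LOSSES, STEPS, N_STEPS):
        hits = sum(
            attack_succeeds(x, predict(x), LOSSES[loss], STEPS[step], n, eps)
            for x in points
        )
        results[(loss, step, n)] = hits / len(points)
    best = max(results, key=results.get)
    return best, results

rng = random.Random(0)
POINTS = [[rng.uniform(-1, 1) for _ in W] for _ in range(200)]
BEST, RESULTS = search(POINTS)
print(BEST, RESULTS[BEST])
```

On a linear model the choice of loss barely matters and the step rule dominates, so this toy space is far less interesting than a space built for real defenses. The point is only the structure: components are interchangeable, and each configuration is judged by how many adversarial examples it finds.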



Code: https://github.com/eth-sri/adaptive-auto-attack
