Countering the Attack-Defense Complexity Gap for Robust Classifiers

Published: 01 Feb 2023, Last Modified: 13 Feb 2023 · Submitted to ICLR 2023 · Readers: Everyone
Keywords: adversarial attacks, adversarial robustness, computational complexity, dataset
TL;DR: We provide a formal rationale for why attacks are more efficient than defenses and introduce a new defensive technique that sidesteps this asymmetry.
Abstract: We consider the decision versions of the problems of defending and attacking Machine Learning classifiers. We provide a rationale for the well-known difficulty of building robust models: in particular, we prove that, under broad assumptions, attacking a polynomial-time classifier is $NP$-complete, while training a polynomial-time model that is robust on even a single input is $\Sigma_2^P$-complete. We also provide more general bounds for non-polynomial classifiers. We then show how this complexity gap can be sidestepped by introducing Counter-Attack (CA), a system that computes on-the-fly robustness certificates for a given input up to an arbitrary distance bound $\varepsilon$. We also prove that, even when attacked with perturbations of magnitude $\varepsilon^\prime > \varepsilon$, CA still provides computational robustness: specifically, while computing a certificate is $NP$-complete, attacking the system beyond its intended robustness is $\Sigma_2^P$-complete. Since the exact form of CA can still be computationally expensive, we introduce a relaxation of this method, which we empirically show to be reliable at identifying non-robust inputs. As part of our work, we introduce UG100, a new dataset obtained by applying a provably optimal attack to six limited-scale networks (three for MNIST and three for CIFAR10), each trained in three different ways.
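
The abstract only sketches the relaxed Counter-Attack (CA) idea, so here is a minimal illustrative Python sketch of that idea as we read it: for a single input, search for an adversarial perturbation inside the $\varepsilon$-ball and flag the input as non-robust if one is found, otherwise treat it as robust up to $\varepsilon$. Everything below is an assumption made for illustration, not the paper's implementation: the function names (`pgd_within_ball`, `relaxed_counter_attack`), the choice of an L-infinity PGD search, and all hyperparameters.

```python
# Hypothetical sketch of a relaxed Counter-Attack (CA) check.
# NOT the paper's implementation; names and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_within_ball(model, x, y, eps, steps=40, step_size=None):
    """L-inf PGD restricted to the eps-ball around x.
    Returns an adversarial example if one is found, otherwise None."""
    step_size = step_size if step_size is not None else 2.5 * eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()   # ascent step on the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # keep a valid pixel range
            if model(x_adv).argmax(dim=1).item() != y.item():
                return x_adv.detach()
    return None


def relaxed_counter_attack(model, x, y, eps):
    """Heuristic per-input check: 'non-robust' if the bounded attack succeeds,
    otherwise the input is treated as robust up to eps. (The exact CA in the
    paper computes a certificate rather than relying on a heuristic attack.)"""
    adv = pgd_within_ball(model, x, y, eps)
    return ("non-robust", adv) if adv is not None else ("robust-up-to-eps", None)


if __name__ == "__main__":
    # Toy demo with an untrained linear classifier on a random "image".
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)
    y = model(x).argmax(dim=1)  # use the model's own prediction as the label
    status, _ = relaxed_counter_attack(model, x, y, eps=0.1)
    print(status)
```

Because the attack is restricted to the $\varepsilon$-ball, a failed search only gives an empirical (not sound) robustness claim for that input; this is the sense in which the relaxation trades exactness for speed.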
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Theory (eg, control theory, learning theory, algorithmic game theory)