On the Power of Abstention and Data-Driven Decision Making for Adversarial Robustness

Nina Balcan; Avrim Blum; Dravyansh Sharma; Hongyang Zhang

28 Sept 2020 (modified: 22 Jun 2025) | ICLR 2021 Conference Blind Submission | Readers: Everyone

Keywords: Adversarial Machine Learning, Learning Theory

Abstract: We formally define a feature-space attack in which the adversary can perturb datapoints by arbitrary amounts but only in restricted directions. By restricting the attack to a small random subspace, our model provides a clean abstraction for non-Lipschitz networks, which map small input movements to large feature movements. We prove that, in this setting, classifiers that can abstain are more powerful than those that cannot. Specifically, we show that no matter how well-behaved the natural data is, any classifier that cannot abstain will be defeated by such an adversary. However, by allowing abstention, we give a parameterized algorithm with provably good performance against such an adversary when classes are reasonably well-separated in feature space and the dimension of the feature space is high. We further use a data-driven method to set our algorithm parameters to optimize the accuracy vs. abstention trade-off with strong theoretical guarantees. Our theory has direct applications to the technique of contrastive learning, where we empirically demonstrate the ability of our algorithms to obtain high robust accuracy with only small amounts of abstention in both supervised and self-supervised settings. Our results provide a first formal abstention-based gap and a first provable optimization of the induced trade-off in an adversarial defense setting.

One-sentence Summary: We develop algorithms with provable guarantees for defense against adversarial attacks that utilize abstention and also provably learn parameters to optimize over the accuracy vs. abstention trade-off.
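The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the setting it describes, not the authors' method. It assumes a thresholded nearest-neighbor rule (predict the nearest training point's label if it lies within a distance `tau`, otherwise abstain) and an adversary that moves a point by an arbitrary amount along a single direction. The names `predict_with_abstention`, `perturb_along_direction`, `random_unit_direction`, `tau`, and `ABSTAIN` are hypothetical.

```python
import math
import random

# Hypothetical sentinel meaning "the classifier declines to predict".
ABSTAIN = None


def dist(a, b):
    """Euclidean distance between two feature vectors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_with_abstention(x, train_points, train_labels, tau):
    """Assumed rule: return the nearest training label if it is within tau, else abstain.

    Setting tau = float("inf") gives a plain nearest-neighbor classifier
    that never abstains.
    """
    nearest = min(range(len(train_points)), key=lambda i: dist(x, train_points[i]))
    if dist(x, train_points[nearest]) <= tau:
        return train_labels[nearest]
    return ABSTAIN


def perturb_along_direction(x, direction, amount):
    """Adversary model: move x by an arbitrary amount along one allowed direction."""
    return [xi + amount * di for xi, di in zip(x, direction)]


def random_unit_direction(dim, rng):
    """Sample a uniformly random unit vector (a one-dimensional random subspace)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


if __name__ == "__main__":
    dim = 50
    # Two well-separated classes, one training point each (toy data, not from the paper).
    train_points = [[0.0] * dim, [10.0] + [0.0] * (dim - 1)]
    train_labels = [0, 1]
    x = [0.1] + [0.0] * (dim - 1)  # a natural point near class 0

    u = random_unit_direction(dim, random.Random(0))
    for amount in (0.0, 0.5, 5.0, 50.0):
        x_adv = perturb_along_direction(x, u, amount)
        print(amount, predict_with_abstention(x_adv, train_points, train_labels, tau=1.0))
```

In this toy setup the trade-off is visible in `tau`: a small threshold makes the rule abstain on points pushed far from all training data, while `tau = float("inf")` forces a label on every input, including heavily perturbed ones.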

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/on-the-power-of-abstention-and-data-driven/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=S4CQ54pSQ6
