The Interplay between Distribution Parameters and the Accuracy-Robustness Tradeoff in Classification
Keywords: Adversarial Robustness, Accuracy-Robustness Tradeoff, Natural Accuracy Gap
TL;DR: Under general Gaussian class-conditional distributions and an $\ell_\infty$ attack of budget at most a sufficiently small $\epsilon$, the natural accuracy gap between the optimal Bayes and optimal robust classifiers is $\Theta(\epsilon^2)$.
Abstract: Adversarial training tends to produce models that are less accurate on natural (unperturbed) examples than standard models. This can be attributed either to an algorithmic shortcoming or to a fundamental property of the training data distribution that admits different solutions for the optimal standard and adversarial classifiers. In this work, we focus on the latter case in a binary Gaussian mixture classification problem. Unlike earlier work, we derive the natural accuracy gap between the optimal Bayes and adversarial classifiers, and study how the distributional parameters, namely the separation between class centroids, the class proportions, and the covariance matrix, affect this gap. We show that under certain conditions, the natural error of the optimal adversarial classifier and the gap itself are locally minimized when the classes are balanced, in contrast to the Bayes classifier, for which perfect balance induces the worst accuracy. Moreover, we show that with an $\ell_\infty$-bounded perturbation of adversarial budget $\epsilon$, this gap is $\Theta(\epsilon^2)$ for the worst-case parameters. For suitably small $\epsilon$, this indicates that robust classifiers with near-perfect natural accuracy are theoretically attainable, which is rarely reflected by practical algorithms.
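To make the $\Theta(\epsilon^2)$ rate concrete, here is a minimal numerical sketch for the simplest instance of the setting: a balanced, isotropic mixture $\mathcal{N}(\pm\mu, I)$. It assumes, following prior work on this special case rather than anything stated in the abstract, that the optimal $\ell_\infty$-robust classifier is linear with soft-thresholded weights $w_\epsilon = \mathrm{sgn}(\mu)\max(|\mu|-\epsilon, 0)$; the centroid `mu` and the budgets below are illustrative choices, not parameters from the paper.

```python
# A minimal sketch, not the paper's construction: for a balanced, isotropic
# binary Gaussian mixture N(+-mu, I), the Bayes classifier is sign(mu^T x).
# ASSUMPTION (from prior work on this special case): the optimal l_inf-robust
# linear classifier soft-thresholds the mean, w_eps = sgn(mu)*max(|mu|-eps, 0).
# The natural error of any linear rule sign(w^T x) has the closed form
# Phi(-w^T mu / ||w||_2), so the natural accuracy gap can be evaluated
# exactly and its eps^2 scaling checked numerically.
import numpy as np
from scipy.stats import norm

def natural_error(w, mu):
    """Natural (clean) error of sign(w^T x) on the mixture N(+-mu, I)."""
    return norm.cdf(-(w @ mu) / np.linalg.norm(w))

rng = np.random.default_rng(0)
mu = rng.uniform(0.2, 0.6, size=5)   # illustrative centroid; entries exceed all eps below
bayes_err = natural_error(mu, mu)    # optimal Bayes classifier uses w = mu

for eps in [0.1, 0.05, 0.025, 0.0125]:
    w_rob = np.sign(mu) * np.maximum(np.abs(mu) - eps, 0.0)  # soft-thresholded mean
    gap = natural_error(w_rob, mu) - bayes_err
    print(f"eps={eps:7.4f}  gap={gap:.3e}  gap/eps^2={gap / eps**2:.3f}")
```

As $\epsilon$ shrinks, the printed ratio $\text{gap}/\epsilon^2$ stabilizes near a constant, consistent with the stated $\Theta(\epsilon^2)$ behavior in this toy instance.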