Demystifying the Adversarial Robustness of Random Transformation Defenses

22 Nov 2021, 06:35 (modified: 22 Dec 2021, 05:03) · AAAI-22 AdvML Workshop Oral · Readers: Everyone
Keywords: adversarial examples, robustness, defense evaluation, random image transformation
TL;DR: We show that defenses against adversarial examples that rely on random image transformations are not as robust as previously believed, by demonstrating the ineffectiveness of BPDA and proposing a new state-of-the-art attack.
Abstract: Current machine learning models suffer from evasion attacks (i.e., adversarial examples), raising concerns in security-sensitive settings such as autonomous vehicles. While many countermeasures may look promising, only a few withstand rigorous evaluation. Recently, defenses using random transformations (RT) have shown impressive results, particularly BaRT (Raff et al. 2019) on ImageNet. However, this type of defense has not been rigorously evaluated, leaving its robustness properties poorly understood. The stochasticity of these models also makes evaluation more challenging and renders many attacks proposed for deterministic models inapplicable. First, we show that the BPDA attack (Athalye, Carlini, and Wagner 2018) used in BaRT's evaluation is ineffective and likely overestimates its robustness. We then attempt to construct the strongest possible RT defense through the informed selection of transformations and Bayesian optimization for tuning their parameters. Furthermore, we create the strongest possible attack to evaluate our RT defense. Our new attack vastly outperforms the baseline, reducing the accuracy by $83\%$ compared to the $19\%$ reduction achieved by the commonly used EoT attack (a $4.3\times$ improvement). Our result indicates that the RT defense on the Imagenette dataset (a ten-class subset of ImageNet) is not robust against adversarial examples. Extending the study further, we use our new attack to adversarially train the RT defense (called AdvRT). However, the attack is still not sufficiently strong, and thus the AdvRT model is no more robust than its RT counterpart. In the process of formulating our defense and attack, we perform several ablation studies and uncover insights that we hope will broadly benefit scientific communities studying stochastic neural networks and their robustness properties.
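The EoT (Expectation over Transformation) baseline attack mentioned in the abstract estimates the gradient of the expected loss under the defense's random transformations by averaging gradients over sampled transformations, then takes a projected signed-gradient step. Below is a minimal, hedged sketch of that idea using a toy linear model and additive noise as a stand-in for an image transformation; all function names and parameters here are illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "model": logit = w @ x; attack loss L(z) = -z,
# so the gradient of the loss w.r.t. x is simply -w.
w = rng.normal(size=8)

def loss_grad(x):
    # Gradient of L(w @ x) with L(z) = -z
    return -w

def random_transform(x):
    # Stand-in for a random image transformation (here: additive noise)
    return x + rng.normal(scale=0.1, size=x.shape)

def eot_gradient(x, n_samples=100):
    # Monte Carlo estimate: average loss gradients over sampled transforms
    grads = [loss_grad(random_transform(x)) for _ in range(n_samples)]
    return np.mean(grads, axis=0)

def pgd_step(x, x_orig, eps=0.03, step=0.01):
    # One signed-gradient step, projected back into the eps-ball around x_orig
    x_adv = x + step * np.sign(eot_gradient(x))
    return np.clip(x_adv, x_orig - eps, x_orig + eps)
```

In a real attack the transformation and model would be differentiable image operations and a neural network, and many PGD steps would be taken; the averaging step is what distinguishes EoT from a single-sample gradient attack on a stochastic model.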