TL;DR: We identify a family of defense techniques and show that both deterministic lossy compression and randomized perturbations to the input lead to similar gains in robustness.
Abstract: The existence of adversarial examples, inputs crafted by making small changes to correctly predicted examples so that they are misclassified, is one of the most significant challenges in neural network research today. Ironically, many new defenses rest on a simple observation: the adversarial inputs themselves are not robust, and small perturbations to the attacking input often recover the desired prediction. While the intuition is somewhat clear, a detailed understanding of this phenomenon is missing from the research literature. This paper presents a comprehensive experimental analysis of when and why perturbation defenses work, and of potential mechanisms that could explain their effectiveness (or ineffectiveness) in different settings.
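For concreteness, below is a minimal sketch of the two kinds of input channels described above: a deterministic lossy JPEG round-trip and a stochastic Gaussian-noise perturbation, each applied to an image before the classifier's forward pass. The `model.predict` call, function names, and default parameters (`quality=75`, `sigma=0.05`) are illustrative assumptions, not the paper's exact configuration.

```python
import io
import numpy as np
from PIL import Image

def jpeg_compress(x, quality=75):
    """Deterministic channel: round-trip an image (float array in [0, 1],
    shape HxWx3) through lossy JPEG compression."""
    img = Image.fromarray((np.clip(x, 0.0, 1.0) * 255).astype(np.uint8))
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf), dtype=np.float32) / 255.0

def gaussian_noise(x, sigma=0.05, rng=None):
    """Stochastic channel: add i.i.d. Gaussian noise and clip back to [0, 1]."""
    rng = rng or np.random.default_rng()
    return np.clip(x + rng.normal(0.0, sigma, size=x.shape), 0.0, 1.0)

def defended_predict(model, x, channel):
    """Pass the input through a perturbation channel before predicting.
    `model` is assumed to expose a Keras-style predict() on a batch."""
    return model.predict(channel(x)[None, ...])
```

Either channel can be plugged into `defended_predict`, e.g. `defended_predict(model, x, jpeg_compress)` or `defended_predict(model, x, gaussian_noise)`; the paper's claim is that both degrade the adversarial perturbation enough to often restore the original prediction.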
Code: https://github.com/anonymous-user-commits/stochastic-channels-iclr
Keywords: adversarial examples, defenses, stochastic channels, deterministic channels, input transformations, compression, noise, convolutional neural networks