Abstract: Over the past four years, neural networks have been shown to be vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions. We show that adversarial vulnerability increases with the size of the gradients of the training objective, viewed as a function of the inputs. For most current network architectures, we prove that the L1-norm of these gradients grows as the square root of the input size. These networks therefore become increasingly vulnerable with growing image size. Our proofs rely on the network’s weight distribution at initialization, but extensive experiments confirm that our conclusions still hold after standard training.
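The abstract's central quantity, the L1-norm of the gradient of the training loss with respect to the input at initialization, can be checked empirically in a few lines. The sketch below is not the authors' code: the MLP architecture, layer widths, random Gaussian inputs, and the chosen input sizes are illustrative assumptions. It only measures how that norm behaves as the input size d grows.

```python
# Hedged sketch (not the paper's code): average L1-norm of the gradient of the
# loss with respect to the input, for freshly initialized fully connected nets
# at several input sizes d. Architecture and data are illustrative assumptions.
import torch
import torch.nn as nn

def avg_input_grad_l1(d, n_samples=64, width=512, n_classes=10):
    """Average per-sample L1-norm of d(loss)/d(input) at initialization."""
    model = nn.Sequential(
        nn.Linear(d, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, n_classes),
    )
    # 'sum' reduction keeps each sample's gradient unscaled by the batch size.
    loss_fn = nn.CrossEntropyLoss(reduction="sum")
    x = torch.randn(n_samples, d, requires_grad=True)   # random "images"
    y = torch.randint(0, n_classes, (n_samples,))       # random labels
    loss = loss_fn(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return grad.abs().sum(dim=1).mean().item()          # L1 over inputs, mean over samples

if __name__ == "__main__":
    torch.manual_seed(0)
    for d in [64, 256, 1024, 4096]:
        print(f"d = {d:5d}   mean ||grad_x loss||_1 = {avg_input_grad_l1(d):.4f}")
```

If the square-root-of-input-size claim holds in this toy setting, quadrupling d should roughly double the printed norms; this is only a sanity check at initialization, not a reproduction of the paper's experiments.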
Keywords: adversarial vulnerability, neural networks, gradients, FGSM, adversarial data-augmentation, gradient regularization, robust optimization
TL;DR: Neural nets have large gradients by design; that makes them adversarially vulnerable.
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10)
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/adversarial-vulnerability-of-neural-networks/code)