Keywords: Adversarial example, white-box attack, neural networks, discrete linear transforms, DCT, JPEG, wavelet
TL;DR: A novel set of attacks operating in the well-known DCT DWTs domains that do not abolish the usual $L^\infty$ threat model, leading to adversarial examples with higher visual similarity and better adversarial learning transferability.
Abstract: Classical image transformation such as the discrete cosine transform (DCT) and the discrete wavelet transforms (DWTs) provide semantically meaningful representations of images. In this paper we propose a general method for adversarial attacks in such transform domains that, in contrast to prior work, obey the $L^\infty$ constraint in the pixel domain. The key idea is to replace the standard attack based on projections with the barrier method. Experiments with DCT and DWTs produce adversarial examples that are significantly more similar to the original than with prior attacks. Further, through adversarial training we show that robustness against our attacks transfers to robustness against a broad class of common image perturbations.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
5 Replies
Loading