Human imperceptible attacks and applications to improve fairness

29 Sept 2021 (modified: 13 Feb 2023) | ICLR 2022 Conference Desk Rejected Submission | Readers: Everyone
Abstract: Modern neural networks are able to perform at least as well as humans in numerous tasks involving object classification and image generation. However, there is also evidence that perturbations which are imperceptible to humans may significantly degrade the performance of well-trained deep neural networks. We provide a Distributionally Robust Optimization (DRO) framework that integrates human-based image quality assessment methods to design optimal attacks that are imperceptible to humans but significantly damaging to deep neural networks. Our attack algorithm can generate better-quality (less perceptible to humans) attacks than other state-of-the-art human-imperceptible attack methods. We provide an algorithmic implementation of independent interest that can speed up DRO training significantly. Finally, we demonstrate how the use of optimally designed human-imperceptible attacks can improve group fairness in image classification while maintaining similar accuracy.
Supplementary Material: zip