Keywords: Adversarial perturbation, Dictionary learning, Deep learning, Robust model, Transferability
Verify Author List: I have double-checked the author list and understand that additions and removals will not be allowed after the submission deadline.
TL;DR: The paper explores the space of adversarial perturbations within a dataset.
Abstract: Deep Neural Network (DNN) models are vulnerable to deception through the intentional addition of imperceptible perturbations to benign examples, posing a significant threat to security-sensitive applications. Understanding the underlying causes of this phenomenon is therefore crucial for developing robust models. A key research area investigates the characteristics of adversarial directions, which have been found to be perpendicular to decision boundaries and associated with low-density regions of the data. Existing research primarily focuses on adversarial directions for individual examples, yet decision boundaries and data distributions are inherently dataset-dependent. This paper explores the space of adversarial perturbations within a dataset. Specifically, we represent an adversarial perturbation as a linear combination of adversarial directions, followed by a non-linear projection. Using the proposed greedy algorithm, we learn the set of adversarial directions that spans this adversarial space.
Experiments on CIFAR-10 and ImageNet substantiate the existence of the adversarial space as an embedded space within the entire data space. Furthermore, the learned adversarial space enables statistical analysis of decision boundaries. Finally, we observe that the adversarial space learned on one DNN model is model-agnostic, and that the adversarial space learned on a vanilla model is a subset of that learned on a robust model, implicating the data distribution as the underlying cause of adversarial examples.
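The parameterization described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes the non-linear projection is a clip onto an L-infinity ball, and the names `make_perturbation`, `D`, `c`, and `eps` are hypothetical.

```python
import numpy as np

def make_perturbation(D, c, eps):
    """Form a perturbation as a linear combination of learned adversarial
    directions (the columns of D, shape d x k) with coefficients c (shape k),
    then apply a non-linear projection. Here the projection is illustrated
    as a clip onto the L-infinity ball of radius eps; the paper's exact
    projection may differ."""
    delta = D @ c                      # linear combination of directions
    return np.clip(delta, -eps, eps)   # non-linear projection step

# Toy example: 3 adversarial directions in a 5-dimensional input space.
rng = np.random.default_rng(0)
D = rng.standard_normal((5, 3))
c = np.array([0.5, -0.2, 0.1])
delta = make_perturbation(D, c, eps=0.03)
print(delta.shape)  # perturbation lives in the input space: (5,)
```

A greedy training loop, as referenced in the abstract, would then alternate between fitting coefficients for each example and adding one new direction to `D` at a time.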
A Signed Permission To Publish Form In Pdf: pdf
Url Link To Your Supplementary Code: https://github.com/flavie-yuan-liu/Adversarial-Space.git
Primary Area: Trustworthy Machine Learning (accountability, explainability, transparency, causality, fairness, privacy, robustness, autoML, etc.)
Paper Checklist Guidelines: I certify that all co-authors of this work have read and commit to adhering to the guidelines in Call for Papers.
Student Author: No
Submission Number: 336