Keywords: Adversarial examples, Adversarial robustness, Asymptotic equipartition property
Abstract: Adversarial examples, which can mislead neural networks through subtle perturbations, continue to challenge our understanding, raising more questions than answers. This paper presents a novel perspective on interpreting adversarial examples through the Asymptotic Equipartition Property (AEP). Our theoretical analysis examines the noise within these examples, revealing that while normal noise aligns with AEP, adversarial noise does not. This insight allows us to classify samples in high-dimensional space as belonging to either the typical or non-typical set, corresponding to normal and adversarial examples, respectively. 
Our analyses and experiments show that adversarial examples arise from the AEP in high-dimensional space, and we derive key properties regarding their quantity, probability, and information capacity. These findings enhance our understanding of adversarial examples and clarify counterintuitive phenomena associated with them, such as adversarial transferability, the trade-off between robustness and accuracy, and robust overfitting.
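As a rough illustration of the typical-set idea in the abstract, the sketch below checks whether a noise vector satisfies the AEP under an i.i.d. Gaussian model: the per-dimension negative log-likelihood of a typical sample lies within a tolerance of the differential entropy. This is not the paper's construction. The Gaussian model, the values of `sigma`, `n`, and `eps`, the helper names, and the fixed-magnitude sign perturbation used as a stand-in for adversarial noise are all assumptions made for this example.

```python
import math
import random


def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a scalar N(0, sigma^2) variable."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


def empirical_entropy(x, sigma):
    """Per-dimension negative log-likelihood, -(1/n) log p(x),
    of x under an i.i.d. N(0, sigma^2) model."""
    mean_sq = sum(v * v for v in x) / len(x)
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + mean_sq / (2 * sigma ** 2)


def is_typical(x, sigma, eps):
    """True if x lies in the eps-typical set of the Gaussian model."""
    return abs(empirical_entropy(x, sigma) - gaussian_entropy(sigma)) <= eps


rng = random.Random(0)
sigma, n, eps = 0.1, 10_000, 0.05  # illustrative values only

# Noise drawn from the model itself: expected to be typical for large n.
gaussian_noise = [rng.gauss(0.0, sigma) for _ in range(n)]

# Hypothetical stand-in for structured noise: every coordinate has the
# same magnitude 2 * sigma with a random sign.
structured_noise = [2 * sigma * rng.choice((-1.0, 1.0)) for _ in range(n)]

print(is_typical(gaussian_noise, sigma, eps))
print(is_typical(structured_noise, sigma, eps))
```

The outcome depends on the chosen magnitude: a sign perturbation with magnitude exactly `sigma` would match the model's second moment and pass this particular check, so the sketch shows only the membership test, not a general detector.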
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5930