From Malicious to Marvelous: The Art of Adversarial Attack as Diffusion

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Adversarial Robustness, CIFAR-10, Reverse Adversarial Process, Diffusion Model
Abstract: The ubiquitous presence of adversarial attacks in deep learning has frustrated and challenged researchers for years. In this work, we establish a new connection between adversarial attacks and the diffusion process. Specifically, we formulate an adversarial attack as a diffusion process, and by reversing this process we devise a defense mechanism that serves as a general-purpose defense against both black-box and white-box attacks. We call this mechanism the Reverse Adversarial Process (RAP); it is grounded in a theoretical treatment that justifies deploying denoising diffusion models on arbitrary distributions. Empirically, we find that our model defends against adversarial attacks with an unprecedented level of accuracy. For example, our approach achieves exceptional performance on RobustBench, a widely used leaderboard for assessing adversarial robustness, outperforming previous state-of-the-art methods by a clear margin.
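The abstract does not spell out the RAP algorithm, but the general recipe it gestures at (diffuse an adversarially perturbed input forward for a few steps, then run the reverse denoising chain to wash out the perturbation before classification) can be sketched as follows. This is a minimal illustration under assumed DDPM conventions, not the authors' implementation; ToyEpsNet, purify, and the t_star/T schedule values are hypothetical stand-ins for a trained noise-prediction network and the paper's actual settings.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained diffusion model's noise-prediction
# network eps_theta(x_t, t); the paper's actual architecture is not given.
class ToyEpsNet(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x, t):
        # A real denoiser would condition on the timestep t; this toy ignores it.
        return self.net(x)

def purify(x_adv, eps_net, t_star=100, T=1000):
    """Diffuse a (possibly adversarial) input forward for t_star steps,
    then run the DDPM reverse chain back to t = 0."""
    betas = torch.linspace(1e-4, 0.02, T)          # standard linear schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Forward diffusion to t_star: x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps
    ab = alpha_bars[t_star - 1]
    x = ab.sqrt() * x_adv + (1 - ab).sqrt() * torch.randn_like(x_adv)

    # Reverse (denoising) process via ancestral DDPM sampling
    for t in reversed(range(t_star)):
        eps = eps_net(x, torch.tensor([t]))
        mean = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x

if __name__ == "__main__":
    x_adv = torch.randn(1, 3, 32, 32)  # CIFAR-10-shaped adversarial input
    x_purified = purify(x_adv, ToyEpsNet())
    print(x_purified.shape)  # torch.Size([1, 3, 32, 32]); feed to a classifier
```

The purified output would then be passed to the downstream classifier; the choice of t_star trades off how much adversarial perturbation is destroyed against how much clean signal is preserved.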
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5643