Certified Defense Against Complex Adversarial Attacks with Dynamic Smoothing

Francesco Quinzan; Marta Kwiatkowska

Certified Defense Against Complex Adversarial Attacks with Dynamic Smoothing

Francesco Quinzan, Marta Kwiatkowska

27 Sept 2024 (modified: 22 Nov 2024)ICLR 2025 Conference Withdrawn SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: AI safety, adversarial robustness, randomized smoothing

TL;DR: A novel approach based on radnomised smoothing to make vision classifiers robust against complex attacks.

Abstract: Randomized smoothing has emerged as a certified defence mechanism with probabilistic guarantees that works at scale. However, current randomized smoothing methods offer theoretical guarantees that are limited by their reliance on specific noise distributions, and they struggle to handle complex adversarial attacks. In this paper, we propose a novel certification method based on randomized smoothing designed to handle complex adversarial attacks, including combinations of multiple attack types. We call this method Dynamic Smoothing (DSmooth). Our key idea is to incorporate more general distributions for smoothing then isotopic Gaussian noise, for which probabilistic guarantees can be derived in terms of the Mahalanobis distance. These general distributions make the smoothed classifier more robust against a wide range of threats, including localized adversarial attacks and multi-attacks. We validate the performance of our method experimentally on challenging threat models using CIFAR-10 and ImageNet, and demonstrate its superiority over state-of-the-art defenses in terms of certified accuracy. Our results show that the proposed method significantly improves the robustness of machine learning models against complex attacks, advancing their suitability for use in safety-critical applications. Code: [removed for review]

Supplementary Material: zip

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 10353

Loading