Adversarially Trained Models with Test-Time Covariate Shift Adaptation

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: Adversarial Training, Certified Robustness
Abstract: Existing defenses against adversarial examples typically provide either empirical or certified robustness. Adversarially trained models demonstrate state-of-the-art empirical defense but provide no robustness guarantees for large classifiers or high-dimensional inputs. In contrast, the randomized smoothing framework provides state-of-the-art certification but significantly degrades empirical performance against adversarial attacks. In this work, we propose a novel \textit{certification through adaptation} technique that transforms an adversarially trained model into a randomized smoothing classifier during inference, providing certified robustness in the $\ell_2$ norm without affecting its empirical robustness against adversarial attacks. One advantage of our technique is that it allows us to choose an appropriate noise level separately for each test example during inference. It also outperforms existing randomized smoothing models for $\ell_2$ certification on CIFAR-10. Our work is therefore a step towards bridging the gap between empirical and certified robustness against adversarial examples, achieving both with the same classifier for the first time.
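The abstract does not spell out the procedure, but the inference-time adaptation it describes amounts to running a standard randomized smoothing certification (in the style of Cohen et al., 2019) on top of the adversarially trained base classifier, with the noise level chosen per test example. Below is a minimal sketch under that assumption, for a PyTorch base model `model` that maps a batch of inputs to logits; the function name `certify_l2`, the sample sizes, and the `sigma` argument are illustrative, not the paper's actual implementation.

```python
import numpy as np
import torch
from scipy.stats import binomtest, norm


def certify_l2(model, x, sigma, n0=100, n=1000, alpha=0.001, num_classes=10):
    """Certify one input x (a C x H x W tensor) against l2 perturbations by
    smoothing the (frozen) base model with Gaussian noise of scale sigma.
    Returns (predicted_class, certified_radius), or (None, 0.0) on abstain."""
    model.eval()

    def sample_counts(num, batch=100):
        # Count base-model predictions over `num` Gaussian-noised copies of x.
        counts = np.zeros(num_classes, dtype=int)
        with torch.no_grad():
            remaining = num
            while remaining > 0:
                b = min(batch, remaining)
                noise = sigma * torch.randn((b,) + tuple(x.shape), device=x.device)
                preds = model(x + noise).argmax(dim=1).cpu().numpy()
                counts += np.bincount(preds, minlength=num_classes)
                remaining -= b
        return counts

    # Step 1: guess the smoothed prediction from a small sample.
    c_hat = int(sample_counts(n0).argmax())
    # Step 2: lower-bound the top-class probability with a Clopper-Pearson
    # interval computed from a larger, independent sample.
    k = int(sample_counts(n)[c_hat])
    p_lower = binomtest(k, n).proportion_ci(
        confidence_level=1 - 2 * alpha, method="exact").low
    if p_lower <= 0.5:
        return None, 0.0  # abstain: cannot certify at this noise level
    # Cohen et al.'s l2 radius: sigma * Phi^{-1}(p_lower).
    return c_hat, float(sigma * norm.ppf(p_lower))
```

Because the base model's weights are never changed, its empirical robustness is untouched; one could sweep `sigma` over a small grid per example and keep the largest certified radius, which appears to be the per-example noise selection the abstract alludes to.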
One-sentence Summary: Transforms an adversarially trained model into a randomized smoothing classifier during inference to provide certified robustness in the $\ell_2$ norm without affecting its state-of-the-art empirical robustness against adversarial attacks.
Supplementary Material: zip