Pretrain-to-Finetune Adversarial Training via Sample-wise Randomized Smoothing

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Keywords: Adversarial Robustness, Provable Adversarial Defense, Sample-wise Randomized Smoothing.
Abstract: Developing certified models that can provably defend against adversarial perturbations is important in machine learning security. Recently, randomized smoothing, combined with other techniques (Cohen et al., 2019; Salman et al., 2019), has been shown to be an effective method for certifying models under $l_2$ perturbations. Existing work on certifying $l_2$ perturbations adds the same level of Gaussian noise to every sample. The noise level determines the trade-off between test accuracy and the average certified robust radius. We propose to further improve the defense via sample-wise randomized smoothing, which assigns a different noise level to each sample. Specifically, we propose a pretrain-to-finetune framework that first pretrains a model and then adjusts the noise levels based on the model's outputs to obtain higher performance. For certification, we carefully allocate a specific robust region to each test sample. We perform extensive experiments on the CIFAR-10 and MNIST datasets, and the results demonstrate that our method achieves a better accuracy-robustness trade-off in the transductive setting.
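The underlying mechanism is standard randomized smoothing certification, with the twist that the noise level sigma may differ across samples. Below is a minimal sketch of per-sample certification following Cohen et al. (2019); the names (`certify_smoothed`, `base_classifier`) and the toy linear classifier are illustrative assumptions, not the paper's code, and the paper's pretrain-to-finetune procedure for choosing each sample's sigma is not shown.

```python
import numpy as np
from scipy.stats import norm, beta

def certify_smoothed(base_classifier, x, sigma, n=1000, alpha=0.001, num_classes=10):
    """Certify one input under randomized smoothing with a per-sample noise
    level sigma (the sample-wise idea: sigma may differ across inputs).

    Returns (predicted_class, certified_l2_radius); radius 0.0 means abstain.
    """
    # Monte Carlo: classify n Gaussian-perturbed copies of x.
    noisy = x[None, :] + sigma * np.random.randn(n, x.size)
    counts = np.bincount(base_classifier(noisy), minlength=num_classes)
    top = counts.argmax()
    # Clopper-Pearson lower confidence bound on the top-class probability.
    p_lower = beta.ppf(alpha, counts[top], n - counts[top] + 1)
    if p_lower <= 0.5:
        return top, 0.0  # cannot certify at this confidence level
    # Cohen et al. (2019) certified radius: R = sigma * Phi^{-1}(p_lower).
    return top, sigma * norm.ppf(p_lower)

# Toy usage with a dummy linear classifier (illustrative only): two inputs
# certified with different, sample-specific noise levels.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 784))
clf = lambda batch: (batch @ W.T).argmax(axis=1)
x1, x2 = rng.standard_normal(784), rng.standard_normal(784)
print(certify_smoothed(clf, x1, sigma=0.25))
print(certify_smoothed(clf, x2, sigma=1.00))
```

Note the trade-off the abstract describes: a larger sigma can yield a larger certified radius when the smoothed prediction stays confident, but degrades accuracy on harder samples, which is why assigning sigma per sample can help.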
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose sample-wise randomized smoothing and achieve a better accuracy-robustness trade-off.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=mD2ts5pl4H