Perturbation Diversity Certificates Robust Generalisation

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission · Readers: Everyone
Keywords: adversarial example, robust generalisation, adversarial training
Abstract: Whilst adversarial training has proven to be the most effective defence against adversarial attacks on deep neural networks, it tends to overfit and thus may not generalise robustly to unseen adversarial data. This is possibly because conventional adversarial training methods generate adversarial perturbations in a supervised way, so that the adversarial samples are highly biased towards the decision boundary, resulting in an inhomogeneous data distribution. To mitigate this limitation, we propose a novel adversarial training method from a perturbation diversity perspective. Specifically, we generate perturbed samples not only adversarially but also diversely, so as to certify a significant robustness improvement through a homogeneous data distribution. We provide both theoretical and empirical analysis that establishes a solid foundation for the proposed method. To verify the method's effectiveness, we conduct extensive experiments on different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) under different adversarial attacks (e.g., PGD, CW). Experimental results show that our method outperforms state-of-the-art approaches (e.g., PGD and Feature Scattering) in robust generalisation performance. (Source codes are available in the supplementary material.)
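To make the idea of "adversarial yet diverse" perturbation generation concrete, below is a minimal PyTorch sketch. It is not the authors' method (which is only specified in their supplementary code): the function name `diverse_pgd_perturb`, the `div_weight` parameter, and the choice of a pairwise cosine-similarity regulariser over batch perturbations are all illustrative assumptions layered on top of a standard PGD inner maximisation.

```python
import torch
import torch.nn.functional as F

def diverse_pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10, div_weight=0.5):
    """PGD-style perturbation with a hypothetical batch-diversity term.

    Illustrative only: the diversity regulariser penalises the mean pairwise
    cosine similarity between perturbations in the batch, so the inner
    maximisation spreads perturbations out instead of pushing every sample
    straight towards the decision boundary. Requires batch size > 1.
    """
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_loss = F.cross_entropy(model(x + delta), y)
        d = F.normalize(delta.flatten(1), dim=1)       # unit-norm perturbations
        sim = d @ d.t()                                # pairwise cosine similarities
        off_diag = sim - torch.diag(torch.diag(sim))   # zero out self-similarity
        div_loss = off_diag.sum() / (d.size(0) * (d.size(0) - 1))
        # Ascend the adversarial loss while descending the similarity,
        # i.e. maximise both attack strength and perturbation diversity.
        loss = adv_loss - div_weight * div_loss
        (grad,) = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        delta.data = (x + delta.data).clamp(0, 1) - x  # keep adversarial images in [0, 1]
    return (x + delta).detach()
```

Under this reading, the outer training loop would simply minimise the classification loss on the diversified adversarial examples, as in standard adversarial training.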
Supplementary Material: zip