Alternating Guided Training for Robust Adversarial Defense
Abstract: Adversarial examples can cause neural networks to produce high-confidence incorrect predictions in image classification tasks. To defend against adversarial example attacks, we propose a deep learning defense method called alternating guided training (AGT). AGT employs a fully convolutional neural network as the basic defense model (BDM) and a pre-trained classification model (e.g., ResNet) as the target substitute model (TSM). Over a series of iterations, the BDM and TSM are trained alternately: the perturbation-elimination capability of the former is enhanced while the classification decision boundary of the latter is corrected, yielding an overall robust adversarial defense. In black-box scenarios with ResNet-34 as the classification model, AGT achieves average defense rates exceeding 95.04% and 73.98% on CIFAR-10 and Mini-ImageNet, respectively, achieving state-of-the-art performance. The code is available at https://github.com/X-L-Liu/AGT.
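The alternating scheme described above can be sketched as a two-phase training step. This is a minimal illustrative sketch, not the authors' implementation: the model sizes, loss choices (MSE for purification, cross-entropy for classification), and all function names here (`BDM`, `agt_step`, etc.) are assumptions for illustration only.

```python
# Hypothetical sketch of one alternating-guided iteration (illustrative
# only; names and losses are assumptions, not the authors' code).
import torch
import torch.nn as nn


class BDM(nn.Module):
    """Basic defense model: a small fully convolutional purifier."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def agt_step(bdm, tsm, x_adv, x_clean, y, opt_bdm, opt_tsm):
    """One alternating iteration: train BDM, then fine-tune TSM."""
    # Phase 1: train the BDM to remove perturbations (TSM untouched).
    opt_bdm.zero_grad()
    loss_bdm = nn.functional.mse_loss(bdm(x_adv), x_clean)
    loss_bdm.backward()
    opt_bdm.step()

    # Phase 2: fine-tune the TSM on purified inputs (BDM frozen).
    opt_tsm.zero_grad()
    with torch.no_grad():
        purified = bdm(x_adv)
    loss_tsm = nn.functional.cross_entropy(tsm(purified), y)
    loss_tsm.backward()
    opt_tsm.step()
    return loss_bdm.item(), loss_tsm.item()
```

In this reading, phase 1 improves the BDM's perturbation elimination against a fixed target, and phase 2 adjusts the classifier's decision boundary to the purified distribution; repeating the two phases corresponds to the paper's alternating guidance.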