LTD: Low Temperature Distillation for Gradient Masking-free Adversarial Training

TMLR Paper917 Authors

05 Mar 2023 (modified: 18 Jun 2023) · Rejected by TMLR
Abstract: Adversarial training has been widely used to enhance the robustness of neural network models against adversarial attacks. However, there is still a notable gap between natural accuracy and robust accuracy. We found that one of the reasons is that the commonly used labels, one-hot vectors, hinder the learning process for image recognition. Representing an ambiguous image with a one-hot vector is imprecise, and the model may fall into a suboptimal solution. In this paper, we propose a method called Low Temperature Distillation (LTD), which is based on the knowledge distillation framework, to generate the desired soft labels. Unlike previous work, LTD uses a relatively low temperature in the teacher model and employs different, but fixed, temperatures for the teacher and student models. This modification boosts robustness without resorting to defensive distillation. Moreover, we have investigated methods to synergize the use of natural and adversarial data in LTD. Experimental results show that, without extra unlabeled data, the proposed method combined with previous works achieves 58.19%, 31.13%, and 42.08% robust accuracy on the CIFAR-10, CIFAR-100, and ImageNet datasets, respectively.
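The abstract describes producing soft labels by applying a relatively low, fixed temperature to the teacher's softmax output while the student uses its own fixed temperature. As an illustration only, the sketch below shows one way such temperature-scaled soft labels could be generated and used as training targets; the function names, temperature values, and loss form are assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def soft_labels_from_teacher(teacher_logits, teacher_temp=0.5):
    # A low teacher temperature sharpens the softmax output while still
    # keeping some probability mass on secondary classes, yielding soft
    # labels rather than one-hot vectors. The value 0.5 is illustrative.
    return F.softmax(teacher_logits / teacher_temp, dim=1)

def student_distillation_loss(student_logits, soft_labels, student_temp=1.0):
    # The student keeps its own fixed temperature; cross-entropy against
    # the teacher's soft labels replaces the usual one-hot target.
    log_probs = F.log_softmax(student_logits / student_temp, dim=1)
    return -(soft_labels * log_probs).sum(dim=1).mean()
```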
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Modifications in the third version:
a. Revised the statement of the closed-world assumption and the definition of adversarial examples for clarity.
b. Reorganized the introduction to adversarial training to establish a stronger connection to the gradient masking issue.
c. Moved the analysis and simulations to Appendix B and provided additional details on our simulations to support the benefits of using soft label representations in real-world scenarios. Also included the results of the temperature selection analysis to reinforce our findings.
d. Added an ablation study on λ to provide further insight into the effect of this parameter on our model's performance.
e. Added more detail to the discussion section to provide a comprehensive understanding of our research and its implications.

Modifications in the second version:
a. We have removed the presentation of inconsistent batch normalization (Sections 3.4, 3.5, and 5.3.1 in the original manuscript). Additionally, we have combined Sections 3.1 and 3.2 of the original manuscript into Section 3.1 (Classification Problem in Real-world Scenario) and Sections 4.1 and 4.2 of the original manuscript into Section 4.1 (Training Framework).
b. We have polished Section 3.2 (Oracle Distribution Estimation) and added simulation evidence to reinforce our claim that soft labels are superior to one-hot vectors.
c. We have added a section titled "Comparison with Existing Works" in Section 4.3.
d. We have added an in-depth discussion of potential limitations and future directions in Section 5.3 (Discussion).
e. We have taken great care to eliminate grammatical errors from the manuscript.
Assigned Action Editor: ~Simon_Kornblith1
Submission Number: 917