Improving Adversarial Robustness via Information Bottleneck Distillation

Published: 21 Sept 2023, Last Modified: 27 Dec 2023, NeurIPS 2023 poster
Keywords: Information Bottleneck, Adversarial training, Adversarial robustness, Knowledge distillation
TL;DR: We introduce a robust prior into the variational information bottleneck (IB) and present Information Bottleneck Distillation to improve adversarial robustness.
Abstract: Previous studies have shown that optimizing the information bottleneck can significantly improve the robustness of deep neural networks. Our study closely examines the information bottleneck principle and proposes an Information Bottleneck Distillation approach. This specially designed, robust distillation technique uses prior knowledge from a robust pre-trained model to strengthen the information bottleneck. Specifically, we propose two distillation strategies that align with the two optimization processes of the information bottleneck. First, we use a robust soft-label distillation method to increase the mutual information between latent features and output prediction. Second, we introduce an adaptive feature distillation method that automatically transfers relevant knowledge from the teacher model to the student model, thereby reducing the mutual information between the input and latent features. We conduct extensive experiments to evaluate the robustness of our approach against state-of-the-art adversarial attacks such as PGD and AutoAttack. The experimental results demonstrate that our approach significantly improves adversarial robustness. Our code is available at
Submission Number: 1797
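
Illustrative sketch (not the authors' code): the abstract names two distillation terms, a soft-label term and a feature term, without giving their exact form. The block below is a minimal, dependency-free guess at how such a combined objective might look. The function names, the per-dimension `weights` (a stand-in for the "adaptive" feature weighting), and the coefficients `alpha`, `beta`, and `temperature` are all assumptions made for illustration; the paper's actual losses may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) between softened class distributions.

    Assumption: a standard knowledge-distillation KL term stands in for
    the "robust soft-label distillation" mentioned in the abstract.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log of zero for extreme logits
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def weighted_feature_loss(student_feat, teacher_feat, weights):
    """Weighted mean squared error between latent features.

    Assumption: `weights` is a hypothetical per-dimension importance
    vector; how the paper derives its adaptive weighting is not stated.
    """
    total_weight = sum(weights)
    squared = (w * (s - t) ** 2 for w, s, t in zip(weights, student_feat, teacher_feat))
    return sum(squared) / total_weight


def distillation_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                      weights, alpha=1.0, beta=0.1, temperature=1.0):
    """Hypothetical combined objective: alpha * soft-label + beta * feature."""
    kd = soft_label_kd_loss(student_logits, teacher_logits, temperature)
    feat = weighted_feature_loss(student_feat, teacher_feat, weights)
    return alpha * kd + beta * feat
```

In this sketch the soft-label term plays the role of the prediction-side objective (latent features versus output) and the feature term plays the role of the compression-side objective (input versus latent features); the mapping of each term to a mutual-information bound is taken from the abstract's description, not derived here.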