Balancing Generalization and Robustness in Adversarial Training via Steering through Clean and Adversarial Gradient Directions
Abstract: Adversarial training (AT) is a fundamental method for enhancing the robustness of Deep Neural Networks (DNNs) against adversarial examples. While AT improves robustness on adversarial examples, it often reduces accuracy on clean examples. Considerable effort has been devoted to handling this trade-off from the perspective of the \textit{input space}. However, we demonstrate that the trade-off can also be characterized from the perspective of the \textit{gradient space}. In this paper, we propose Adversarial Training with Adaptive Gradient Reconstruction (\textit{AGR}), a novel approach that balances generalization (accuracy on clean examples) and robustness (accuracy on adversarial examples) in adversarial training by steering through clean and adversarial gradient directions. We first introduce a technique named Gradient Orthogonal Projection for the case of negatively correlated gradients, which adjusts the adversarial gradient direction to reduce the degradation of generalization. We then present a gradient interpolation scheme for the case of positively correlated gradients, which efficiently increases generalization without compromising the robustness of the final model. Rigorous theoretical analysis proves that \textit{AGR} enjoys a lower generalization error upper bound, indicating its effectiveness. Comprehensive experiments empirically demonstrate that \textit{AGR} achieves an excellent balance between generalization and robustness, and is compatible with various adversarial training methods to achieve superior performance. Our code is available at: \url{https://github.com/RUIYUN-ML/AGR}.
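The two gradient-steering cases described in the abstract can be sketched as follows. This is a minimal illustration of the stated idea, not the authors' implementation: the function name, the interpolation coefficient `alpha`, and the exact projection rule (removing the conflicting component of the adversarial gradient, in the style of gradient-surgery methods) are all assumptions for illustration.

```python
import numpy as np

def steer_gradient(g_clean, g_adv, alpha=0.5):
    """Hypothetical sketch of the abstract's two cases.

    - Negative correlation: project g_adv onto the plane orthogonal to
      g_clean (Gradient Orthogonal Projection), removing the component
      that opposes the clean-accuracy direction.
    - Positive correlation: interpolate the two gradients.
    """
    dot = float(np.dot(g_clean, g_adv))
    if dot < 0:
        # Remove from g_adv its (negative) component along g_clean.
        return g_adv - (dot / np.dot(g_clean, g_clean)) * g_clean
    # Gradients agree: a convex combination (alpha is an assumed knob).
    return alpha * g_clean + (1.0 - alpha) * g_adv
```

For example, with `g_clean = [1, 0]` and a conflicting `g_adv = [-1, 1]`, the projection returns `[0, 1]`, which no longer opposes the clean gradient.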
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: This work focuses on information security for multimedia, aiming to improve the security of multimedia visual models while enhancing their practicality and generalization.
Supplementary Material: zip
Submission Number: 1984