Improving Clean Accuracy via a Tangent-Space Perspective on Adversarial Training

TMLR Paper5576 Authors

08 Aug 2025 (modified: 20 Oct 2025). Withdrawn by Authors. CC BY 4.0

Abstract: Adversarial training has proven effective in improving the robustness of deep neural networks against adversarial attacks. However, this enhanced robustness often comes at the cost of a substantial drop in accuracy on clean data. In this paper, we address this limitation by introducing Tangent Direction Guided Adversarial Training (TART), a novel and theoretically well-grounded method that enhances clean accuracy by exploiting the geometry of the data manifold. We argue that adversarial examples with large components in the normal direction can overly distort the decision boundary and degrade clean accuracy. TART addresses this issue by estimating the tangent direction of adversarial examples and adaptively modulating the perturbation bound based on the norm of their tangential component. To the best of our knowledge, TART is the first adversarial defense framework that explicitly incorporates the concept of tangent space and direction into adversarial training. Extensive experiments on both synthetic and benchmark datasets demonstrate that TART consistently improves clean accuracy while maintaining robustness against adversarial attacks.

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~bo_han2

Submission Number: 5576
