Abstract: Adversarial training can improve model robustness by aligning discriminative features between natural samples and generated adversarial samples. However, as training progresses, the generated adversarial samples tend to contain more features derived from changed patterns belonging to other categories, which hinders feature alignment between natural and adversarial samples. Existing adversarial training methods ignore this dynamicity of generated adversarial samples. In this paper, we propose Adversarial Contrastive Decoupling (ACD) to filter out the features derived from changed patterns. Specifically, we decouple the changed patterns from adversarial samples and then extract robust representations from the remaining features. First, we introduce a decoupling module with a dynamic labeling strategy to explore the dynamicity of generated adversarial samples. Then, we propose a siamese network with a contrastive learning mechanism to align the remaining robust representations between adversarial and natural samples. Extensive experimental results demonstrate the superior performance of ACD over baselines.
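
To make the alignment step concrete, the following is a minimal sketch of a contrastive objective that pulls the representation of each adversarial sample toward that of its natural counterpart. The abstract does not state which contrastive loss ACD uses, so the InfoNCE-style formulation, the function names `l2_normalize` and `info_nce_alignment`, the `temperature` value, and the random feature vectors standing in for the outputs of a shared (siamese) encoder are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce_alignment(nat_feats, adv_feats, temperature=0.1):
    """InfoNCE-style loss (an assumed choice, not taken from the paper).

    Row i of adv_feats is treated as the positive for row i of nat_feats;
    every other natural sample in the batch serves as a negative.
    """
    z_nat = l2_normalize(nat_feats)
    z_adv = l2_normalize(adv_feats)
    # Pairwise cosine similarities, shape (N, N), scaled by the temperature.
    logits = z_adv @ z_nat.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matched natural/adversarial pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical features standing in for the outputs of a shared encoder.
rng = np.random.default_rng(0)
nat = rng.normal(size=(8, 16))
aligned_adv = nat + 0.01 * rng.normal(size=(8, 16))  # close to natural features
unrelated_adv = rng.normal(size=(8, 16))             # no relation to natural features

loss_aligned = info_nce_alignment(nat, aligned_adv)
loss_unrelated = info_nce_alignment(nat, unrelated_adv)
```

In this sketch the loss is lower when adversarial features stay close to their natural counterparts than when they are unrelated, which is the behavior an alignment objective of this kind is meant to reward.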