Triple Down on Robustness: Understanding the Impact of Adversarial Triplet Compositions on Adversarial Robustness
Abstract: Adversarial training is a widely used technique for improving the robustness of machine learning models, and its effectiveness has been further bolstered by modifying the loss function or adding terms to the training objective. While such adaptations are validated through empirical studies, they lack a solid theoretical basis explaining why the resulting models behave robustly. In this paper, we investigate the integration of adversarial triplets into the adversarial training framework, a method previously shown to enhance robustness. The reasons behind this increased robustness, however, are poorly understood, and the impact of different adversarial triplet configurations remains unclear. To address this gap, we use the robust and non-robust features framework to analyze how various adversarial triplet compositions influence robustness. Specifically, we introduce a novel framework that explains how different compositions of adversarial triplets lead to distinct training dynamics and, in turn, to different levels of adversarial robustness. We validate our theoretical findings through empirical analysis, demonstrating that our framework accurately characterizes the effects of adversarial triplets on the training process. Our results offer a comprehensive explanation of how adversarial triplets influence the security and robustness of models, providing a theoretical foundation for methods that employ adversarial triplets to improve robustness. This work not only deepens our theoretical understanding but also has practical implications for developing more robust machine learning models.
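As a concrete illustration of the setting the abstract describes, the minimal PyTorch sketch below shows one possible adversarial triplet composition: the adversarial example serves as the anchor, its clean counterpart as the positive, and a clean example from another class as the negative, with the triplet term added to a standard adversarial cross-entropy objective. All names (`Net`, `pgd`, `triplet_term`, `train_step`), the hyperparameter values, and this particular anchor/positive/negative assignment are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """Toy CNN exposing both a feature embedding and class logits."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def embed(self, x):
        return self.features(x)

    def forward(self, x):
        return self.head(self.embed(x))

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-inf PGD attack used to craft the adversarial member of each triplet."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(
            F.cross_entropy(model(x_adv), y), x_adv)[0]
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def triplet_term(model, x_adv, x_pos, x_neg, margin=1.0):
    """Triplet loss with the adversarial example as anchor, the clean
    same-class image as positive, and an other-class image as negative
    (one composition among the several the paper studies)."""
    a, p, n = model.embed(x_adv), model.embed(x_pos), model.embed(x_neg)
    d_pos = F.pairwise_distance(a, p)
    d_neg = F.pairwise_distance(a, n)
    return F.relu(d_pos - d_neg + margin).mean()

def train_step(model, opt, x, y, x_neg, lam=0.5):
    """One step of adversarial training augmented with the triplet term."""
    x_adv = pgd(model, x, y)
    loss = F.cross_entropy(model(x_adv), y) \
        + lam * triplet_term(model, x_adv, x, x_neg)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Under these assumptions, usage would look like `model = Net()`, `opt = torch.optim.SGD(model.parameters(), lr=0.1)`, then `train_step(model, opt, x, y, x_neg)` per batch, where `x_neg` holds same-shaped images drawn from classes other than `y`; swapping which triplet slot the adversarial example occupies is exactly the kind of compositional choice whose effect on robustness the paper analyzes.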