Towards Reliable Transferability of Targeted Adversarial Attacks against Model Discrepancy

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Targeted Adversarial Attacks, Transferability
Abstract: Adversarial attacks pose a serious threat to deep neural networks, especially in black-box scenarios where transferability plays a key role. Targeted transfer attacks, in which an attacker induces a specific misclassification on an unseen black-box model, remain significantly more challenging than non-targeted attacks. We attribute this gap to model discrepancies between surrogate and target models, including mismatches in feature representations, classifier heads, and Jacobians. To address these challenges, we define a unified uncertainty set capturing these model discrepancies and propose a principled robust objective over this set. While intractable in full form, this view leads to a tractable relaxation: the Targeted Attack toward Reliable Transferability (TART). TART integrates three components: (1) expectation over transforms, to cover representation and Jacobian variability; (2) latent mixing, to model attenuation and clean-feature leakage; and (3) feature matching, to guide perturbations toward semantically robust regions. Extensive experiments on ImageNet and CIFAR-10 show that TART consistently outperforms state-of-the-art transfer-based black-box targeted attacks across both convolutional and transformer architectures. For example, when transferring from ResNet-50 to Swin-S on ImageNet, TART achieves a 42.7% higher attack success rate than the strongest baseline. Our approach establishes a new benchmark for robust black-box adversarial evaluation.
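Since the abstract lists TART's three components without further detail, here is a minimal NumPy sketch of how such a robust objective could be assembled. All names, the additive-jitter transform, and the toy linear feature map are illustrative assumptions, not the paper's actual implementation:

```python
# Illustrative sketch of TART's three components (expectation over
# transforms, latent mixing, feature matching) on a toy linear
# "feature extractor". Everything here is an assumption for
# illustration; the paper's formulation may differ.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))      # toy stand-in for a surrogate's feature map

def features(x):
    return W @ x                      # deep features replaced by a linear map

def random_transform(x, rng):
    # (1) expectation over transforms: here, small additive input jitter
    return x + 0.01 * rng.standard_normal(x.shape)

def tart_loss(x_adv, x_clean, target_feat, n_transforms=4, alpha=0.7, rng=rng):
    losses = []
    for _ in range(n_transforms):
        f_adv = features(random_transform(x_adv, rng))
        # (2) latent mixing: blend adversarial and clean features to model
        # attenuation and clean-feature leakage on the unseen target model
        f_mix = alpha * f_adv + (1.0 - alpha) * features(x_clean)
        # (3) feature matching: pull mixed features toward a target-class
        # feature anchor (here, a random vector standing in for a centroid)
        losses.append(np.sum((f_mix - target_feat) ** 2))
    return float(np.mean(losses))     # Monte Carlo expectation over transforms

x_clean = rng.standard_normal(16)
x_adv = x_clean + 0.05 * rng.standard_normal(16)
target_feat = rng.standard_normal(8)
loss = tart_loss(x_adv, x_clean, target_feat)
```

In a real attack, `tart_loss` would be minimized over `x_adv` by projected gradient descent under an L-infinity budget; the averaging over transforms is what gives the objective its robustness to the uncertainty set described above.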
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 23877