Megatron: Evasive Clean-Label Backdoor Attacks Against Vision Transformer

Xueluan Gong, Bowei Tian, Meng Xue, Shuaike Li, Yanjiao Chen, Qian Wang

Published: 2026, Last Modified: 12 Mar 2026. IEEE Trans. Dependable Secur. Comput. 2026. License: CC BY-SA 4.0

Abstract: Vision transformers have achieved impressive performance in various vision-related tasks, but their vulnerability to backdoor attacks is under-explored. A handful of existing works focus on dirty-label attacks with wrongly-labeled poisoned training samples, which may fail if a benign model trainer corrects the labels. In this paper, we propose Megatron, an evasive clean-label backdoor attack against vision transformers, where the attacker injects the backdoor without manipulating the data-labeling process. To generate an effective trigger, we employ a local surrogate vision transformer to approximate the victim model and customize two attention-based loss terms: latent loss and attention diffusion loss. The latent loss aligns the last attention layer between triggered samples and clean samples of the target label. The attention diffusion loss emphasizes the attention diffusion area that encompasses the trigger. A theoretical analysis is provided to underpin the rationale behind the attention diffusion loss. Extensive experiments on CIFAR-10, GTSRB, CIFAR-100, and Tiny ImageNet demonstrate the effectiveness of Megatron. Megatron can achieve attack success rates of over 90% even when the position of the trigger is slightly shifted during testing. Furthermore, Megatron achieves better evasiveness than baselines regarding both human visual inspection and defense strategies (i.e., DBAVT, BAVT, Beatrix, TeCo, and SAGE).
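
As a rough illustration of how the two attention-based loss terms named in the abstract might combine into a trigger objective, the following is a minimal sketch under stated assumptions. The abstract does not give the formulas, so everything here is hypothetical: the latent loss is assumed to be a mean squared difference between flattened last-layer attention maps, the attention diffusion loss is assumed to be one minus the share of attention mass inside a region around the trigger, and the names `latent_loss`, `attention_diffusion_loss`, `trigger_objective`, `region_mask`, and `weight` are invented for this example. It is not the authors' implementation.

```python
# Illustrative sketch only. The formulas below are assumptions made for this
# example; the abstract does not specify the paper's actual loss definitions.

def latent_loss(attn_triggered, attn_target):
    """Assumed form: mean squared difference between two flattened attention maps
    (triggered sample vs. a clean sample of the target label)."""
    if len(attn_triggered) != len(attn_target):
        raise ValueError("attention maps must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(attn_triggered, attn_target)]
    return sum(squared) / len(squared)


def attention_diffusion_loss(attn, region_mask):
    """Assumed form: one minus the share of attention mass that falls inside
    the masked region surrounding the trigger (lower means more attention there)."""
    if len(attn) != len(region_mask):
        raise ValueError("attention map and mask must have the same length")
    total = sum(attn)
    if total <= 0:
        raise ValueError("attention map must have positive total mass")
    inside = sum(a for a, m in zip(attn, region_mask) if m)
    return 1.0 - inside / total


def trigger_objective(attn_triggered, attn_target, region_mask, weight=1.0):
    """Hypothetical combined objective: latent term plus a weighted diffusion term."""
    return latent_loss(attn_triggered, attn_target) + weight * attention_diffusion_loss(
        attn_triggered, region_mask
    )


if __name__ == "__main__":
    # Toy attention over four patches; the trigger region covers the last two.
    triggered = [0.1, 0.1, 0.4, 0.4]
    target = [0.2, 0.2, 0.3, 0.3]
    mask = [0, 0, 1, 1]
    print(trigger_objective(triggered, target, mask, weight=0.5))
```

In a real attack pipeline these quantities would be computed on attention tensors from the surrogate vision transformer and minimized with respect to the trigger by gradient descent; the plain-list version above only shows the shape of the objective.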

External IDs: dblp:journals/tdsc/GongTXLCW26