Backdoor Complications: A Comprehensive Analysis and Mitigation of the Unforeseen Consequences of Backdoor Attacks

Rui Zhang, Yun Shen, Hongwei Li, Wenbo Jiang, Hanxiao Chen, Yuan Zhang, Guowen Xu, Yang Zhang

Published: 01 Jan 2026, Last Modified: 11 Mar 2026 | IEEE Transactions on Dependable and Secure Computing | CC BY-SA 4.0
Abstract: Pre-trained language models (PTLMs) have become integral to modern natural language processing (NLP), yet their reuse exposes them to supply chain risks such as backdoor attacks. Existing studies assume that attackers target specific downstream tasks, overlooking how a backdoored PTLM behaves when fine-tuned for unrelated applications. In practice, such unintended adaptation can trigger anomalous and inconsistent predictions, revealing the backdoor and compromising its stealthiness. We define this phenomenon as backdoor complications, i.e., unintended behavioral side effects emerging on non-target tasks. This work presents the first systematic quantification and mitigation of backdoor complications. Through extensive experiments on 3 widely used PTLMs and 15 benchmark datasets, we show that complications are pervasive across both single- and multi-task attack settings, causing triggered outputs to collapse into arbitrary classes. To address this issue, we propose the Complication-Suppressed Backdoor Attack (CSBA), a task-agnostic, multi-objective framework that leverages auxiliary non-target datasets to suppress backdoor complications. CSBA effectively suppresses complications on unseen downstream tasks while maintaining near-perfect attack success rates. Our work reveals a critical side effect in backdoored PTLMs and provides a new perspective on the stealthiness and robustness of model supply chain security.