Backdoor Attacks on Large Language Model Based Semantic Communication Systems

Published: 01 Jan 2025 · Last Modified: 04 Nov 2025 · ICC Workshops 2025 · CC BY-SA 4.0
Abstract: We propose an efficient backdoor attack on a task-oriented semantic communication system built with Large Language Models (LLMs) (e.g., BERT and RoBERTa) and self-attention techniques. To obtain a backdoored model, we assume an adversary first trains a clean model and then embeds backdoors into it, in pre-training phases one and two, respectively. In phase one, the LLM-based semantic communication system is pre-trained using a loss function that combines the standard cross-entropy loss with a smoothness-inducing adversarial component, so that the system is robust against Gaussian noise and wireless fading. The salient feature of our design is that the proposed backdoor attack remains effective even if the backdoored system is further fine-tuned for different downstream tasks. This effectiveness is achieved through a carefully designed training procedure in pre-training phase two that maps a poisoned input to an output representation (OR) close to a predefined OR. This predefined OR is highly likely to differ from the true OR of the clean input, leading to an incorrect prediction for the considered task. Our experiments show that the proposed attack strategy is efficient and stealthy across a wide range of signal-to-noise ratios (SNRs) and for different types of triggers and tasks. Moreover, we demonstrate that the proposed attack withstands both model pruning and fine-tuning defense strategies.
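The two training objectives described in the abstract can be summarized in a short sketch. The PyTorch snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the model interface (model.embed, model.encode, model.head), the awgn_channel helper, the single-step PGD inner loop, and all hyperparameter names (lambda_s, eps, target_or) are hypothetical. Phase one pairs cross-entropy with a smoothness-inducing adversarial regularizer (in the style of SMART) under simulated channel noise; phase two pulls the output representation (OR) of triggered inputs toward a predefined target OR while anchoring clean ORs to a frozen clean model, which is the mechanism the abstract credits for surviving downstream fine-tuning.

```python
# Hypothetical sketch of the two-phase objectives described in the abstract.
# All names and interfaces are illustrative assumptions, not released code.
import torch
import torch.nn.functional as F

def awgn_channel(z, snr_db):
    """Add Gaussian noise to the transmitted semantic features at a given SNR (dB)."""
    signal_power = z.pow(2).mean()
    noise_power = signal_power / (10 ** (snr_db / 10))
    return z + torch.randn_like(z) * noise_power.sqrt()

def phase1_loss(model, x, y, snr_db, lambda_s=1.0, eps=1e-3, steps=1):
    """Cross-entropy plus a smoothness-inducing adversarial term: predictions
    should change little under small embedding perturbations, which also
    hardens the system against channel noise."""
    emb = model.embed(x)                                    # token embeddings
    logits = model.head(awgn_channel(model.encode(emb), snr_db))
    ce = F.cross_entropy(logits, y)

    # Search for a worst-case perturbation inside an eps-ball (PGD-style).
    delta = torch.zeros_like(emb).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_logits = model.head(awgn_channel(model.encode(emb + delta), snr_db))
        kl = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                      F.softmax(logits.detach(), dim=-1), reduction="batchmean")
        grad, = torch.autograd.grad(kl, delta)
        delta = (delta + eps * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)

    adv_logits = model.head(awgn_channel(model.encode(emb + delta), snr_db))
    smooth = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                      F.softmax(logits, dim=-1), reduction="batchmean")
    return ce + lambda_s * smooth

def phase2_backdoor_loss(model, clean_model, x_clean, x_poisoned, target_or):
    """Pull the OR of triggered inputs toward a predefined target OR, while
    keeping clean ORs close to the frozen clean model's so the attack stays
    stealthy and survives downstream fine-tuning."""
    or_poisoned = model.encode(model.embed(x_poisoned)).mean(dim=1)
    or_clean = model.encode(model.embed(x_clean)).mean(dim=1)
    with torch.no_grad():
        or_ref = clean_model.encode(clean_model.embed(x_clean)).mean(dim=1)
    attack = F.mse_loss(or_poisoned, target_or.expand_as(or_poisoned))
    stealth = F.mse_loss(or_clean, or_ref)
    return attack + stealth
```

In a full pipeline, the phase-two loss would be minimized over batches mixing clean and triggered sentences; the trigger itself (e.g., a rare token or phrase inserted into the input) and the choice of target OR are design parameters the abstract leaves open.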