Camouflage Is All You Need: Evaluating and Enhancing Transformer Models Robustness Against Camouflage Adversarial Attacks

Published: 2025, Last Modified: 06 Jan 2026. Venue: IEEE Trans. Emerg. Top. Comput. Intell. 2025. License: CC BY-SA 4.0
Abstract: Advanced language models demonstrate remarkable capabilities but remain vulnerable to adversarial word camouflage techniques. These techniques introduce visually perceptible language manipulations while conveying intended meanings to the target audience, potentially altering a model's output. This study explores the effectiveness and limitations of word camouflage in deceiving various language model architectures, including encoder-decoder, encoder-only, and decoder-only models, with a significant focus on the tokenizers employed. Despite their vocabulary diversity, all tokenizers exhibited notable weaknesses against camouflaged words, particularly keyword attacks, highlighting the urgent need for adaptable tokenizers to handle sophisticated adversarial strategies. Consistent with our findings, transformer models without specific training against word camouflage become increasingly compromised as the complexity and volume of camouflaged inputs grow. To address these vulnerabilities, we evaluated external countermeasures such as MASK and BLANK filters, demonstrating that semantic content persists in camouflaged text and can be exploited by models. We employed static and dynamic adversarial training methods, with static training introducing camouflaged data once, while dynamic training continuously updates the data during training. Our results showed that dynamic training effectively counters adversarial attacks and enhances overall model performance, suggesting its dual role as a defensive mechanism and a data augmentation technique. The methodology incorporates the AugLy library for external validation, demonstrating the superior efficacy of dynamic training over static methods. Key contributions include enhancing the open-source tool pyleetspeak to facilitate the creation of augmented camouflaged datasets, providing researchers and practitioners with effective tools to strengthen NLP systems against evolving threats in digital communication.
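The abstract's distinction between static and dynamic adversarial training can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the pyleetspeak or AugLy APIs: the substitution table `LEET_MAP` and the functions `camouflage`, `static_epochs`, and `dynamic_epochs` are hypothetical names introduced here, and simple leetspeak-style character swaps are assumed as a stand-in for the paper's camouflage techniques.

```python
import random

# Hypothetical look-alike table (an assumption for illustration only);
# real camouflage tooling covers far more substitution strategies.
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def camouflage(text, rng, prob=0.5):
    """Swap each mappable character for its look-alike with probability `prob`."""
    return "".join(
        LEET_MAP[ch] if ch in LEET_MAP and rng.random() < prob else ch
        for ch in text
    )


def static_epochs(texts, num_epochs, seed=0):
    """Static: camouflage the data once and reuse that copy in every epoch."""
    rng = random.Random(seed)
    fixed = [camouflage(t, rng) for t in texts]
    return [list(fixed) for _ in range(num_epochs)]


def dynamic_epochs(texts, num_epochs, seed=0):
    """Dynamic: draw a fresh camouflaged copy of the clean data for each epoch."""
    rng = random.Random(seed)
    return [[camouflage(t, rng) for t in texts] for _ in range(num_epochs)]


if __name__ == "__main__":
    clean = ["this is a test sentence about toast"]
    print(static_epochs(clean, num_epochs=3))
    print(dynamic_epochs(clean, num_epochs=3))
```

In this sketch the static variant shows the model the same perturbed strings in every epoch, while the dynamic variant resamples the perturbations each epoch, which is one plausible reading of why the abstract describes dynamic training as also acting like data augmentation.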