Abstract: Text-to-image generative models have garnered immense attention for their ability to produce high-fidelity images from text prompts and enjoyed great popularity among the community. Unfortunately, previous studies have demonstrated that text-to-image models suffer from backdoor attacks, which enforce the text-guided generative models to generate images that align the backdoor target via embedding the textual triggers. However, the currently proposed backdoor attacks rely on numerous training data and complex computing resources for poisoning the core components in generative models, limiting the effectiveness and practicality in real-world scenarios. In this work, we first investigate the backdoor attack against Text-to-image generation by manipulating text tokenizer. Our backdoor attack exploits the semantic conditioning role of text tokenizer in the text-to-image generation. We propose an Automatized Remapping Framework with Optimized Tokens (AROT) for finding the best target tokens to remap the trigger token in the mapping space, according to different tasks. We conduct extensive experiments on Stable Diffusion and two defined tasks to demonstrate the effectiveness, stealthiness and robustness of our attack.
External IDs:dblp:conf/icmcs/HeJHSGL25
Loading