The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Published: 28 Oct 2023, Last Modified: 13 Mar 2024, Venue: NeurIPS 2023 BUGS Oral
Keywords: Backdoor attack, Diffusion Model, Copyright
TL;DR: SilentBadDiffusion, a backdoor attack on diffusion models, can stealthily trigger these models to reproduce copyrighted images, exposing serious flaws in current copyright protection strategies.
Abstract: The commercialization of diffusion models (DMs), renowned for their ability to generate high-quality images that are often indistinguishable from real ones, raises potential copyright concerns. Although attempts have been made to impede unauthorized access to copyrighted material during training and to subsequently prevent DMs from generating copyrighted images, the effectiveness of these solutions remains unverified. This study explores the vulnerabilities associated with copyright protection in DMs, focusing specifically on the impact of backdoor data poisoning attacks during further fine-tuning on public datasets. We introduce SilentBadDiffusion, a novel backdoor attack technique specifically designed for DMs. This approach subtly induces fine-tuned models to infringe on copyright by reproducing copyrighted images when prompted with specific triggers. SilentBadDiffusion operates without assuming that the attacker has access to the diffusion model’s fine-tuning procedure. It generates poisoning data with stealthy prompts as triggers by harnessing the capabilities of vision-language models and text-guided image inpainting techniques. At inference time, DMs draw upon their comprehension of these prompts to reproduce the copyrighted images. Our empirical results indicate that information from copyrighted data can be stealthily encoded into training data, causing the fine-tuned DM to generate infringing content when triggered by the specific prompt. These findings highlight potential pitfalls in prevailing copyright protection strategies and underscore the need for increased scrutiny and preventative measures against the misuse of DMs.
Submission Number: 13