Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion

Published: 28 Jun 2024, Last Modified: 25 Jul 2024
Venue: NextGenAISafety 2024 Oral
License: CC BY 4.0
Keywords: Data poisoning, backdoor attacks, diffusion models, image generation, adversarial, security, safety
TL;DR: We use guided diffusion models to synthesize base samples from scratch that enable potent poisons.
Abstract: Modern neural networks are often trained on massive web-scraped datasets that receive minimal human inspection. Because this curation pipeline is insecure, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet and waiting for a victim to scrape and train on it. Existing approaches for creating poisons and backdoors start with randomly sampled clean data, called base samples, and then modify those samples to craft poisons. However, some base samples may be significantly more amenable to poisoning than others, so we may be able to craft more potent poisons by choosing the base samples carefully. In this work, we use guided diffusion to synthesize base samples from scratch that lead to significantly more potent poisons and backdoors than previous state-of-the-art attacks. Our Guided Diffusion Poisoning (GDP) base samples can be combined with any downstream poisoning or backdoor attack to boost its effectiveness.
Submission Number: 21