Less is More: Stealthy and Adaptive Clean-Image Backdoor Attacks with Few Poisoned Samples

28 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Backdoor Attack, Generative Adversarial Networks, Clean-Image Backdoor Attacks, Deep Neural Networks, InfoGAN, Poisoning Attack, Model Integrity
TL;DR: We introduce GCB, a GAN-based clean-image backdoor attack that optimizes trigger patterns to achieve high success rates with minimal impact on model accuracy.
Abstract: Deep neural networks underpin security-critical applications such as facial recognition, autonomous driving, and medical diagnostics, yet they are vulnerable to backdoor attacks. Clean-image backdoor attacks, which implant backdoors solely through label manipulation, leave models exposed to malicious labelers. However, existing clean-image backdoor attacks typically cause a noticeable drop in Clean Accuracy (CA), reducing their stealthiness. In this paper, we show that clean-image backdoor attacks can incur a negligible decrease in CA by poisoning only a few samples while still maintaining a high attack success rate. We introduce **G**enerative Adversarial **C**lean-Image **B**ackdoors (GCB), a novel attack that keeps the drop in CA below 1\% by optimizing the trigger pattern so that the victim model learns it more easily. Leveraging a variant of InfoGAN, we ensure that the trigger pattern already appears in some training images and can be easily separated from the feature patterns used for the benign task. Our experiments demonstrate that GCB adapts to five datasets (MNIST, CIFAR-10, CIFAR-100, GTSRB, and Tiny-ImageNet), five model architectures, and four tasks: classification, multi-label classification, regression, and segmentation. Furthermore, GCB shows strong resistance to backdoor defenses, evading all detection methods we are aware of. Code: *anonymous.4open.science/r/GCB*.
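To make the label-only threat model concrete, below is a minimal PyTorch sketch of the poisoning step, assuming a trained InfoGAN-style Q-network whose discrete latent code captures a natural feature already present in some training images. This is not the authors' implementation: `QNet`, `poison_labels`, `trigger_code`, and `budget` are hypothetical names for illustration, and the Q-network is randomly initialized here only so the sketch runs end to end.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Hypothetical stand-in for the Q-network of an InfoGAN variant.

    In the attack setting this would be trained so that its discrete latent
    code corresponds to a natural feature of the data; here it is randomly
    initialized purely to make the sketch executable.
    """
    def __init__(self, n_codes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, n_codes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

def poison_labels(images, labels, q_net, trigger_code, target_label, budget):
    """Label-only poisoning: flip the labels of up to `budget` images whose
    inferred latent code matches the chosen trigger code. Pixels are never
    modified, which is what makes the attack 'clean-image'."""
    with torch.no_grad():
        codes = q_net(images).argmax(dim=1)  # inferred discrete latent codes
    poisoned = labels.clone()
    idx = (codes == trigger_code).nonzero(as_tuple=True)[0][:budget]
    poisoned[idx] = target_label
    return poisoned, idx

# Toy usage on random tensors standing in for CIFAR-10-sized images.
images = torch.rand(256, 3, 32, 32)
labels = torch.randint(0, 10, (256,))
poisoned_labels, idx = poison_labels(
    images, labels, QNet(), trigger_code=3, target_label=0, budget=8)
print(f"relabeled {len(idx)} of 256 samples")
```

The key property the sketch illustrates is that no image is altered: only the labels of samples that already exhibit the trigger feature are flipped to the target class, matching the abstract's claim that a few poisoned labels suffice.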
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13961