Keywords: Data Poisoning, Adversarial Attacks, Backdoor Attacks
TL;DR: We propose a novel backdoor attack against image classifiers that can be activated by multiple distinct triggers.
Abstract: Targeted data poisoning poses a critical adversarial threat to machine learning systems by enabling attackers to manipulate training data to induce specific, harmful misclassifications. Among these threats, backdoor attacks are particularly pernicious, embedding hidden triggers in the data that lead models to misclassify only those inputs containing the trigger, while maintaining high accuracy on benign samples. In this paper, we propose Gradient Storm, a novel technique that facilitates the simultaneous execution of multiple backdoor attacks, while necessitating only minimal modification to the training dataset. Our contributions are twofold: First, we introduce a method for designing adversarial poisons in modular components, each tailored based on a distinct region of the model’s parameter space. Second, we present a framework for conducting multi-trigger attacks, where each trigger causes misclassification from a specific source class to a distinct target class. We evaluate the efficacy of Gradient Storm across multiple neural network architectures and two benchmark datasets, demonstrating its robustness against eight different poisoning defense mechanisms. Additionally, we show that poisons crafted for one model can be effectively transferred to other models, demonstrating that our attack remains effective even in black-box settings.
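The multi-trigger setting described above (each trigger flips predictions from a specific source class to a distinct target class) can be illustrated with a classic patch-style poisoning sketch. This is not the paper's Gradient Storm method, which crafts gradient-based poisons; it is only a minimal, hypothetical example of per-pair trigger stamping and relabeling, with all function names (`apply_trigger`, `poison_dataset`) invented for illustration.

```python
import numpy as np

def apply_trigger(image, trigger, position=(0, 0)):
    """Stamp a small trigger patch onto one image of shape (H, W, C)."""
    poisoned = image.copy()
    r, c = position
    h, w = trigger.shape[:2]
    poisoned[r:r + h, c:c + w] = trigger
    return poisoned

def poison_dataset(images, labels, trigger_map, rate=0.01, rng=None):
    """For each (source, target) pair in trigger_map, stamp that pair's
    trigger onto a small fraction of source-class images and relabel
    them as the target class (a dirty-label patch attack, for illustration)."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    for (source, target), trigger in trigger_map.items():
        idx = np.flatnonzero(labels == source)
        chosen = rng.choice(idx, size=max(1, int(rate * len(idx))),
                            replace=False)
        for i in chosen:
            images[i] = apply_trigger(images[i], trigger)
            labels[i] = target
    return images, labels
```

Each (source, target) pair gets its own trigger, so at test time stamping a given trigger on a clean source-class input is what activates that pair's backdoor, while unstamped inputs are classified normally.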
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4633