Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy

15 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Backdoor attack, Data selection, Trustworthy AI
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a learnable poisoning sample selection strategy to boost data-poisoning based backdoor attacks via min-max optimization.
Abstract: Data-poisoning based backdoor attacks aim to inject a backdoor into models by manipulating the training dataset, without controlling the training process of the target model. Existing backdoor attacks mainly focus on designing diverse triggers or fusion strategies to generate poisoned samples. However, all of these attacks randomly select samples from the benign dataset to be poisoned, disregarding the varying importance of different samples. To select important samples to be poisoned from a global perspective, we first introduce a learnable poisoning mask into the regular backdoor training loss. We then propose a Learnable Poisoning sample Selection (LPS) strategy to learn the mask through a min-max optimization. During this two-player game, considering that hard samples contribute more to the training process, the inner optimization maximizes the loss w.r.t. the mask to identify hard poisoned samples by impeding the training objective, while the outer optimization minimizes the loss w.r.t. the model's weights to train the surrogate model. After several rounds of adversarial training, we finally select the poisoned samples with the highest contributions. Extensive experiments on benchmark datasets demonstrate the effectiveness and efficiency of our LPS strategy in boosting the performance of various data-poisoning based backdoor attacks.
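To make the min-max procedure described in the abstract concrete, the sketch below illustrates one plausible reading of it: the inner step instantiates the poisoning mask as the top-k samples whose poisoned versions incur the highest loss under the current surrogate model, and the outer step trains the surrogate model on the resulting mixed dataset. This is only a minimal illustrative sketch, not the authors' implementation; the names `apply_trigger`, `target_label`, `poison budget k`, and the top-k relaxation of the inner maximization are all assumptions.

```python
# Hedged sketch of the LPS-style min-max loop (illustrative only).
# apply_trigger, target_label, and k (poison budget) are assumed names.
import torch
import torch.nn.functional as F


def select_poisoned_indices(model, dataset, apply_trigger, target_label, k, device="cpu"):
    """Inner maximization (assumed top-k form): pick the k samples whose
    poisoned versions incur the highest loss under the current surrogate model."""
    model.eval()
    losses = []
    with torch.no_grad():
        for idx in range(len(dataset)):
            x, _ = dataset[idx]
            x_p = apply_trigger(x).unsqueeze(0).to(device)
            y_p = torch.tensor([target_label], device=device)
            losses.append(F.cross_entropy(model(x_p), y_p).item())
    return torch.tensor(losses).topk(k).indices.tolist()


def lps_rounds(model, dataset, apply_trigger, target_label, k,
               rounds=5, epochs_per_round=2, lr=0.01, device="cpu"):
    """Alternate the inner (mask) maximization and outer (weights) minimization."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    selected = []
    for _ in range(rounds):
        # Inner step: update the poisoning mask, here as the current hard samples.
        selected = select_poisoned_indices(model, dataset, apply_trigger,
                                           target_label, k, device)
        poisoned = set(selected)
        # Outer step: train the surrogate model on benign + poisoned samples.
        model.train()
        for _ in range(epochs_per_round):
            for idx in range(len(dataset)):
                x, y = dataset[idx]
                if idx in poisoned:
                    x, y = apply_trigger(x), target_label
                x = x.unsqueeze(0).to(device)
                y = torch.tensor([y], device=device)
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()
    return selected  # indices of samples chosen to be poisoned
```

The returned indices would then be used to poison the corresponding samples in the released training set; any existing trigger design could play the role of `apply_trigger` here, consistent with the abstract's claim that LPS is a selection strategy layered on top of existing data-poisoning attacks.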
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 249