Confidence-driven Sampling for Backdoor Attacks

Pengfei He; Han Xu; Jie Ren; Yingqian Cui; Shenglai Zeng; Yue Xing; Jiliang Tang; Makoto Yamada; Mohammad Sabokrou

Confidence-driven Sampling for Backdoor Attacks

Pengfei He, Han Xu, Jie Ren, Yingqian Cui, Shenglai Zeng, Yue Xing, Jiliang Tang, Makoto Yamada, Mohammad Sabokrou

22 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX

Primary Area: societal considerations including fairness, safety, privacy

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Backdoor attack, sampling method

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Abstract: Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios. Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples. Our research highlights the overlooked drawbacks of random sampling, which make that attack detectable and defensible. The core idea of this paper is to strategically poison samples near the model's decision boundary and increase defense difficulty. We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks. Importantly, our method operates independently of existing trigger designs, providing versatility and compatibility with various backdoor attack techniques. We substantiate the effectiveness of our approach through a comprehensive set of empirical experiments, demonstrating its potential to significantly enhance resilience against backdoor attacks in DNNs.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6110

Loading