Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

Ganghua Wang; Xun Xian; Ashish Kundu; Jayanth Srinivasa; Xuan Bi; Mingyi Hong; Jie Ding

Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

Ganghua Wang, Xun Xian, Ashish Kundu, Jayanth Srinivasa, Xuan Bi, Mingyi Hong, Jie Ding

Published: 16 Jan 2024, Last Modified: 11 Mar 2024ICLR 2024 posterEveryoneRevisionsBibTeX

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: backdoor attack, machine learning safety, asymptotic, statistical risk

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We provide both finite-sample and asymptotic statistical analysis for theoretically understanding the backdoor attack.

Abstract: Backdoor attacks pose a significant security risk to machine learning applications due to their stealthy nature and potentially serious consequences. Such attacks involve embedding triggers within a learning model with the intention of causing malicious behavior when an active trigger is present while maintaining regular functionality without it. This paper derives a fundamental understanding of backdoor attacks that applies to both discriminative and generative models, including diffusion models and large language models. We evaluate the effectiveness of any backdoor attack incorporating a constant trigger, by establishing tight lower and upper boundaries for the performance of the compromised model on both clean and backdoor test data. The developed theory answers a series of fundamental but previously underexplored problems, including (1) what are the determining factors for a backdoor attack's success, (2) what is the direction of the most effective backdoor attack, and (3) when will a human-imperceptible trigger succeed. We demonstrate the theory by conducting experiments using benchmark datasets and state-of-the-art backdoor attack scenarios. Our code is available \href{https://github.com/KeyWgh/DemystifyBackdoor}{here}.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

Supplementary Material: zip

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Primary Area: societal considerations including fairness, safety, privacy

Submission Number: 8404

Loading