Grond: A Stealthy Backdoor Attack in Model Parameter Space

27 Sept 2024 (modified: 20 Dec 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0
Keywords: backdoor attack, backdoor defense
Abstract: Recent research on backdoor attacks mainly focuses on invisible triggers in input space and inseparable backdoor representations in feature space to increase the backdoor stealthiness against defenses. We examine common backdoor attack practices that look at input-space or feature-space stealthiness and show that state-of-the-art stealthy input-space and feature-space backdoor attacks can be easily spotted by examining the parameter space of the backdoored model. Leveraging our observations on the behavior of the defenses in the parameter space, we propose a novel clean-label backdoor attack called Grond. We present extensive experiments showing that Grond outperforms state-of-the-art backdoor attacks on CIFAR-10, GTSRB, and a subset of ImageNet. Our attack limits the parameter changes through Adversarial Backdoor Injection, adaptively increasing the parameter-space stealthiness. Finally, we show how combining Grond's Adversarial Backdoor Injection with commonly used attacks can consistently improve their effectiveness. Our code is available at https://anonymous.4open.science/r/grond-557F.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 9945