Keywords: backdoor attack, backdoor defense, AI security
Abstract: Backdoor attacks in machine learning create hidden vulnerabilities by manipulating model behaviour with specific triggers. Such attacks often go unnoticed because the model operates as expected on normal inputs. It is therefore imperative to understand the intricate mechanisms of backdoor attacks. To address this challenge, in this work we introduce three key requirements that a backdoor attack must meet. Moreover, we observe that current backdoor attack algorithms, whether employing fixed or input-dependent triggers, are tightly bound to model parameters, rendering them easier to defend against. To tackle this issue, we propose the Key-Locks algorithm, which separates the backdoor attack process into two stages: embedding locks in the model and employing a key to unlock them. This design enables the unlocking level to be adjusted to counteract diverse defense mechanisms. Extensive experiments are conducted to evaluate the effectiveness of our proposed algorithm. Our code is available at: https://anonymous.4open.science/r/KeyLocks-FD85
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9344