Directional Rank Reduction for Backdoor Defense

24 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: backdoor defense, backdoor attack, neuron pruning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Recent studies have indicated the effectiveness of neuron pruning for backdoor defense. In this work, we explore the limitations of pruning-based defense through theoretical and empirical investigations. We argue that pruning-based defense necessitates the removal of neurons that affect normal performance when the effect of backdoor is entangled across normal neurons. To address this challenge, we propose an extended neuron pruning framework, named \emph{Directional Rank Reduction (\method)}. \method consists of three procedures: orthogonal transformation, pruning, and inverse transformation. Through the transformation of the feature space prior to pruning, \method is able to focus the trigger effects on a limited number of neurons for more efficient pruning with less damage, outperforming existing pruning-based defense strategies. We implement \method using Sarle's Bimodality Coefficient (SBC) which is optimized as the criterion for the transformation matrix based on the separability assumption of benign and poisoned features. Extensive experimental results demonstrate the superiority of our method. On average, our approach substantially reduces the ASR by 4.5x and increases the ACC by 1.45\% compared with the recently strong baselines.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9434
Loading