Keywords: Hölder Pruning, Hölder iteration defense, backdoor attacks, Deep Neural Networks, backdoor defense
TL;DR: Hölder Pruning is a computationally efficient defense against backdoor attacks that uses the Hölder constant to detect and remove neurons affected by backdoor triggers.
Abstract: Deep Neural Networks (DNNs) have become the cornerstone of modern machine learning applications, achieving impressive results in domains ranging from computer vision to autonomous systems. However, their dependence on extensive data and computational resources exposes them to vulnerabilities such as backdoor attacks, where poisoned samples can lead to erroneous model outputs. To counter these threats, we introduce a defense strategy called Hölder Pruning that detects and eliminates neurons affected by triggers embedded in poisoned samples. Our method partitions the neural network into two stages, feature extraction and feature processing, aiming to detect and remove backdoored neurons (the highly sensitive neurons affected by the embedded triggers) while maintaining model performance. This partitioning improves model sensitivity to perturbations and enhances pruning precision by exploiting the unique clustering properties of poisoned samples. We use the Hölder constant to quantify the sensitivity of neurons to input perturbations and prove that the Fast Gradient Sign Method (FGSM) can effectively identify highly sensitive backdoored neurons. Our extensive experiments demonstrate the efficacy of Hölder Pruning across six clean feature extractors (SimCLR, pretrained ResNet-18, ViT, ALIGN, CLIP, and BLIP-2) and confirm robustness against nine backdoor attacks (BadNets, LC, SIG, LF, WaNet, Input-Aware, SSBA, Trojan, BppAttack) on three datasets (CIFAR-10, CIFAR-100, GTSRB). We compare Hölder Pruning to seven SOTA backdoor defenses (FP, ANP, CLP, FMP, ABL, DBD, D-ST) and show that it outperforms all of them. Moreover, Hölder Pruning achieves a runtime up to 1000x faster than SOTA defenses when a clean feature extractor is available; even when a clean feature extractor is not available, our method is up to 10x faster.
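To make the abstract's core idea concrete, the following is a minimal, illustrative sketch (not the paper's implementation) of FGSM-driven neuron-sensitivity scoring: perturb an input with an FGSM step, then rank neurons by an empirical Hölder-style ratio of activation change to input change. The toy one-layer "feature processing" network, the surrogate loss, the exponent `alpha`, and the pruning threshold are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))  # toy layer: 8 neurons, 16-dim features

def activations(x):
    # ReLU neuron outputs for input x
    return np.maximum(W @ x, 0.0)

def fgsm_perturb(x, eps=0.1):
    # FGSM step on a simple surrogate loss L(x) = sum(ReLU(W x));
    # its gradient w.r.t. x is W^T applied to the active-neuron indicator.
    grad = W.T @ (W @ x > 0).astype(float)
    return x + eps * np.sign(grad)

def holder_sensitivity(x, eps=0.1, alpha=1.0):
    # Empirical Hölder-style ratio |f_i(x') - f_i(x)| / ||x' - x||^alpha,
    # computed per neuron i; large values flag highly sensitive neurons.
    x_adv = fgsm_perturb(x, eps)
    num = np.abs(activations(x_adv) - activations(x))
    den = np.linalg.norm(x_adv - x) ** alpha + 1e-12
    return num / den

x = rng.normal(size=16)
scores = holder_sensitivity(x)
# Hypothetical pruning rule: flag neurons more than one std above the mean
prune_mask = scores > scores.mean() + scores.std()
```

In the actual defense, such scores would be aggregated over many (possibly poisoned) inputs before any neuron is pruned; a single input, as here, only illustrates the per-neuron ratio being ranked.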
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11195