Abstract: Semi-structured (2:4) sparsity is a widely adopted pruning method in modern hardware and software ecosystems (e.g., NVIDIA Sparse Tensor Cores and PyTorch), achieving up to 2X faster inference and reduced memory footprint with negligible accuracy loss. It removes two out of every four contiguous weights, using permutations to ensure the largest-magnitude weights are retained. In this work, we show that this predictable mechanism can be exploited to design Silent Until Sparse (SUS), a novel compression-activated backdoor attack tailored to the 2:4 sparsity regime. SUS employs a two-phase training procedure that modifies (i) the weights that will be retained after pruning to embed the backdoor, and (ii) the weights that will be pruned to hide it in the dense model. SUS also provides formal guarantees that the attack will be successfully activated after sparsification. Experiments show that SUS is largely effective against semi-structured sparsification across both hardware-accelerated and software pipelines, outperforming existing compression-aware backdoor attacks, bypassing standard defenses, and even being robust to user-side fine-tuning.
External IDs:dblp:journals/corr/abs-2509-08747
Loading