When Less Is More: Uncovering the Robustness Advantage of Model Pruning

ICLR 2026 Conference Submission 19023 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: adversarial robustness, pruning
TL;DR: We theoretically analyze how pruning affects adversarial robustness, revealing a trade-off with accuracy and scenarios where pruning yields both robustness and compression.
Abstract: The interplay between neural network pruning, a widely adopted approach for model compression, and adversarial robustness has garnered increasing attention. However, most existing work focuses on empirical findings, with limited theoretical grounding. In this paper, we address this gap by providing a theoretical analysis of how pruning influences adversarial robustness. We first show that the pruning strategy and associated parameters play a critical role in determining the robustness of the resulting pruned model. We then examine how these choices affect the optimality of pruning in terms of maintaining performance relative to the original model. Building on these results, we formalize the inherent trade-off between clean accuracy and adversarial robustness introduced by pruning, emphasizing the importance of balancing these competing objectives. Finally, we empirically validate our theoretical insights on different models and datasets, reinforcing our novel understanding of the adversarial implications of pruning. Our findings offer a principled foundation for designing pruning strategies that not only achieve model compression but also enhance robustness without additional constraints or cost, yielding a "free-lunch" benefit.
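For readers unfamiliar with the operation the abstract analyzes, the sketch below illustrates one common pruning strategy: unstructured magnitude pruning at a given sparsity level. This is a generic illustration only; the abstract does not specify which pruning strategies or parameters the paper studies, and the function and parameter names here are hypothetical.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight array.

    A minimal sketch of unstructured magnitude pruning, assuming a
    global magnitude criterion; `sparsity` is the fraction of weights
    to remove (e.g. 0.5 removes half of them).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to zero out
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude acts as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep strictly larger magnitudes
    return weights * mask

# Usage: prune a random 4x4 weight matrix to 50% sparsity
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = magnitude_prune(W, sparsity=0.5)
```

The sparsity level here is exactly the kind of "associated parameter" whose choice, per the abstract, shapes the robustness of the pruned model.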
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 19023