Keywords: structured pruning, neural network, model compression
TL;DR: Structured Pruning via Ranking (SPvR) is an efficient pruning approach that reduces both the depth and width of neural networks while maintaining high accuracy and low inference latency across various datasets and architectures.
Abstract: Deep neural networks have achieved state-of-the-art performance in multiple domains but are increasingly resource-intensive, limiting their deployment on constrained devices. We introduce Structured Pruning via Ranking (SPvR), a novel structured pruning approach that addresses this challenge for classification tasks. SPvR prunes pre-trained networks in terms of function composition and network width while adhering to a user-specified parameter budget. Our method leverages local grouping and global ranking modules to generate smaller yet effective networks tailored to a given dataset and model. Finally, we train the pruned networks from scratch instead of fine-tuning them. Our evaluations demonstrate that SPvR significantly surpasses existing state-of-the-art pruning methods on benchmark datasets using standard architectures. Even with a $90$% reduction in size, SPvR's sub-networks experience a minimal drop in test accuracy ($<1$%), and on ImageNet1K we outperform all baselines, achieving a $<1$% Top-5 accuracy drop when pruning $70$% of ResNet50's parameters. Additionally, compared to MobileNetV3, an SPvR-pruned network improves Top-1 accuracy by $3.3$% with $20$% fewer parameters. Furthermore, we empirically show that SPvR reduces inference latency, underscoring its practical benefits for deploying neural networks on resource-constrained devices.
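To make the pipeline described in the abstract concrete, below is a minimal, hypothetical Python/PyTorch sketch of the overall flow: score filter groups locally per layer, rank all groups globally, and keep the top-ranked groups under a user-specified parameter budget (with the surviving architecture then trained from scratch rather than fine-tuned). The abstract does not specify SPvR's actual grouping and ranking criteria, so the per-channel L1-norm score and the helper names (`group_and_score`, `select_under_budget`) here are illustrative assumptions, not the paper's method.

```python
# Hedged sketch of a local-grouping + global-ranking structured pruning loop.
# Assumption: one "group" per conv output channel, scored by L1 filter norm.
import torch
import torch.nn as nn


def group_and_score(model: nn.Module):
    """Collect (layer_name, channel_idx, local_score, n_params) per group.
    L1 norm is an assumed importance proxy, not the paper's scoring module."""
    groups = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight.detach()            # shape: (out, in, kH, kW)
            scores = w.abs().sum(dim=(1, 2, 3))   # L1 norm per output filter
            scores = scores / scores.sum()        # normalize within the layer
            per_channel_params = w[0].numel()
            for c, s in enumerate(scores.tolist()):
                groups.append((name, c, s, per_channel_params))
    return groups


def select_under_budget(groups, param_budget: int):
    """Globally rank all groups by score and greedily keep the best ones
    that fit within the user-specified parameter budget."""
    keep, used = set(), 0
    for name, c, score, n_params in sorted(groups, key=lambda g: -g[2]):
        if used + n_params <= param_budget:
            keep.add((name, c))
            used += n_params
    return keep


# Usage: prune a toy model to ~30% of its conv parameters. In SPvR, the
# resulting sub-network would then be re-instantiated and trained from scratch.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
total = sum(m.weight.numel() for m in model.modules() if isinstance(m, nn.Conv2d))
kept = select_under_budget(group_and_score(model), param_budget=int(0.3 * total))
print(f"kept {len(kept)} channel groups under the budget")
```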
LaTeX Source Code: zip
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission742/Authors, auai.org/UAI/2025/Conference/Submission742/Reproducibility_Reviewers
Submission Number: 742