Keywords: Structured Pruning, Network Compression, Neural Architecture Search, Hessian Approximation, Saliency-based Pruning, Deep Learning, Computer Vision
Abstract: Pruning neural networks reduces inference time and memory cost, and, when done at initialization, also accelerates training. On standard hardware, these benefits are especially prominent if coarse-grained structures, such as feature maps, are pruned. We devise global saliency-based methods for second-order structured pruning (SOSP) that include correlations among structures, with the highest efficiency achieved by saliency approximations based on fast Hessian-vector products. We achieve state-of-the-art results on various object classification benchmarks, especially at the large pruning rates most relevant for resource-constrained applications. We show that our approach scales to large-scale vision tasks, even though it captures correlations across all layers of the network. Further, we highlight two outstanding features of our methods. First, to reduce training costs, our pruning objectives can also be applied at initialization with no or only minor degradation in accuracy compared to pruning after pre-training. Second, our structured pruning methods reveal architectural bottlenecks, which we remove to further increase the accuracy of the networks.
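The second-order saliency idea behind the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: it uses a hypothetical quadratic loss and a finite-difference Hessian-vector product to score the loss change from zeroing out a group of weights (a "structure") via a second-order Taylor expansion, which is the kind of quantity fast HVPs make cheap to compute.

```python
import numpy as np

def grad(w, A, b):
    # Gradient of the toy quadratic loss L(w) = 0.5 * w^T A w - b^T w.
    return A @ w - b

def hvp(w, v, A, b, eps=1e-5):
    # Hessian-vector product via central finite differences of the gradient:
    # H v ≈ (g(w + eps*v) - g(w - eps*v)) / (2*eps).
    # In a deep-learning framework this would be a double-backprop call instead.
    return (grad(w + eps * v, A, b) - grad(w - eps * v, A, b)) / (2 * eps)

def second_order_saliency(w, mask, A, b):
    # Loss change from pruning the structure selected by the binary `mask`,
    # i.e. applying the perturbation delta = -w * mask (zeroing those weights),
    # estimated by the Taylor expansion dL ≈ g^T delta + 0.5 * delta^T H delta.
    delta = -w * mask
    g = grad(w, A, b)
    return g @ delta + 0.5 * delta @ hvp(w, delta, A, b)

# Toy example: prune the first of two weights.
A = np.diag([2.0, 1.0])
b = np.array([1.0, 1.0])
w = np.array([3.0, 4.0])
mask = np.array([1.0, 0.0])
print(second_order_saliency(w, mask, A, b))  # for a quadratic loss, exact dL
```

For a quadratic loss the second-order expansion is exact, so the saliency equals the true loss change; for real networks it is an approximation, and the key point of the HVP trick is that `delta^T H delta` never requires forming the full Hessian.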
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: We introduce a second-order structured pruning method which efficiently captures global correlations among structures of deep neural networks.
Supplementary Material: zip
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/sosp-efficiently-capturing-global/code)
18 Replies