Back to Basics: Efficient Network Compression via IMP

Published: 28 Jan 2022, Last Modified: 22 Oct 2023
ICLR 2022 Submitted
Readers: Everyone
Keywords: Magnitude pruning, Sparsity, IMP, Model Compression
Abstract: Network pruning is a widely used technique for compressing deep neural networks with little to no degradation in inference performance. Iterative Magnitude Pruning (IMP) (Han et al., 2015) is one of the most established approaches to network pruning. It consists of several iterative training and pruning steps, where a significant amount of the network’s performance is lost after pruning and then recovered in the subsequent retraining phase. While IMP is commonly used as a benchmark reference, it is often argued that (a) it reaches suboptimal states by not incorporating sparsification into the training phase, (b) its global selection criterion fails to properly determine optimal layer-wise pruning rates, and (c) its iterative nature makes it slow and non-competitive. In light of recently proposed retraining techniques, we investigate these claims through rigorous and consistent experiments in which we compare IMP to pruning-during-training algorithms, evaluate proposed modifications of its selection criterion, and study the number of iterations and total training time actually required. We find that IMP with SLR (Le & Hua, 2021) for retraining can outperform state-of-the-art pruning-during-training approaches with little or no computational overhead, that the global magnitude selection criterion is largely competitive with more complex approaches, and that only a few retraining epochs are needed in practice to achieve most of the sparsity-vs.-performance trade-off of IMP. Our goals are both to demonstrate that basic IMP can already provide state-of-the-art pruning results on par with or outperforming more complex or heavily parameterized approaches, and to establish a more realistic yet easily realizable baseline for future research.
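The prune-then-retrain cycle and the global magnitude criterion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `retrain` callback, and the geometric sparsity schedule are assumptions chosen for illustration, and a real setup would retrain with an actual optimizer and learning-rate schedule.

```python
import numpy as np

def global_magnitude_masks(weights, sparsity):
    """Return one boolean keep-mask per layer, chosen by a single global threshold.

    Weights whose magnitude is at or below the threshold are pruned, so ties at
    the threshold can prune slightly more than the requested fraction.
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    n_prune = int(sparsity * all_mags.size)
    if n_prune == 0:
        return [np.ones(w.shape, dtype=bool) for w in weights]
    # Magnitude of the n_prune-th smallest weight across all layers.
    threshold = np.partition(all_mags, n_prune - 1)[n_prune - 1]
    return [np.abs(w) > threshold for w in weights]

def imp(weights, retrain, target_sparsity, n_cycles):
    """Illustrative IMP loop: prune globally by magnitude, then retrain, repeatedly.

    `retrain(weights, masks)` is a hypothetical callback that must return updated
    weights with pruned entries still equal to zero. The geometric schedule below
    is an assumption, not taken from the paper.
    """
    masks = [np.ones(w.shape, dtype=bool) for w in weights]
    for cycle in range(1, n_cycles + 1):
        # Assumed schedule: the kept fraction shrinks geometrically per cycle.
        sparsity = 1.0 - (1.0 - target_sparsity) ** (cycle / n_cycles)
        masks = global_magnitude_masks(weights, sparsity)
        weights = [w * m for w, m in zip(weights, masks)]
        weights = retrain(weights, masks)
    return weights, masks

# Toy usage: two "layers" and a placeholder retraining step that only re-applies the masks.
layers = [np.array([[0.1, -2.0], [0.5, 3.0]]), np.array([0.05, -1.0, 0.2, 4.0])]
keep_only = lambda ws, ms: [w * m for w, m in zip(ws, ms)]
pruned, final_masks = imp(layers, keep_only, target_sparsity=0.75, n_cycles=2)
```

Because a single threshold is applied across all layers, the per-layer pruning rates fall out of the weight magnitudes rather than being set by hand, which is the property the abstract's claim (b) calls into question.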
One-sentence Summary: We demonstrate that basic IMP can already provide state-of-the-art pruning results on par with or outperforming more complex or heavily parameterized approaches, and we aim to establish a more realistic yet easily realizable baseline for future research.
Supplementary Material: zip
Community Implementations: [7 code implementations](https://www.catalyzex.com/paper/arxiv:2111.00843/code)
