Examining the Value of Neural Filter Pruning -- Retrospect and Prospect

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: Neural network filter pruning, value of pruning, trainability, dynamical isometry
Abstract: Neural network filter pruning is one of the major methods for model compression and acceleration. Despite the remarkable progress of the past several years, there is an ongoing debate concerning the value of filter pruning -- some works in 2019 argue that filter pruning is of no value, since they found that training the pruned network from scratch can achieve performance similar to, or even better than, pruning a pretrained model. This argument fundamentally challenges the value of many filter pruning works. However, to date, the community has not formally responded to this acute questioning. In this paper, we present extensive empirical analyses showing that the seeming contradiction is due to suboptimal learning rate schedule settings. We introduce stricter comparison setups and show that filter pruning still has value within the same training epoch budget. Beyond justifying the value of filter pruning empirically, we further examine the reason behind it and discover that the poor trainability caused by pruning is largely responsible for the sub-optimality of the learning rate schedule, calling for an urgent need to recover trainability after pruning. This paper does not target new SOTA performance for filter pruning. Instead, we focus on clarifying existing mysteries in filter pruning toward a better understanding.
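For readers unfamiliar with the setting, below is a minimal, hypothetical sketch of one common filter pruning criterion (L1-norm ranking of convolutional filters), written in PyTorch. It is an illustration of the general technique under debate, not the authors' specific method; the function name, keep_ratio parameter, and layer sizes are all assumptions for the example.

```python
# Minimal sketch (not the paper's method): magnitude-based filter pruning.
# Conv2d filters are ranked by L1 norm and the lowest-ranked ones removed;
# in practice, fine-tuning (or retraining from scratch) would follow.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping the filters with the largest L1 norms."""
    with torch.no_grad():
        # Weight shape is (out_channels, in_channels, kH, kW);
        # compute the L1 norm of each output filter.
        norms = conv.weight.abs().sum(dim=(1, 2, 3))
        n_keep = max(1, int(keep_ratio * conv.out_channels))
        keep_idx = torch.argsort(norms, descending=True)[:n_keep]

        pruned = nn.Conv2d(
            conv.in_channels, n_keep, conv.kernel_size,
            stride=conv.stride, padding=conv.padding,
            bias=conv.bias is not None,
        )
        pruned.weight.copy_(conv.weight[keep_idx])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep_idx])
    return pruned

# Usage: prune a single layer to 50% of its filters.
conv = nn.Conv2d(64, 128, 3, padding=1)
smaller = prune_conv_filters(conv, keep_ratio=0.5)
print(smaller.weight.shape)  # torch.Size([64, 64, 3, 3])
```

Note that pruning a layer in a full network also requires adjusting the next layer's input channels (and any attached BatchNorm); the comparisons discussed in the abstract concern how such pruned networks are then retrained, e.g., fine-tuned versus trained from scratch under the same epoch budget.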
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
TL;DR: We study the "value of filter pruning" debate and show that the no-value argument may be inaccurate due to suboptimal LR setups; we provide further insights to explain the reason behind it.