Keywords: Neural Network Pruning, Sparsity
Abstract: Pruning at initialization (PaI) methods aim to remove weights of neural networks before training in order to reduce training costs. While current PaI methods are promising and outperform random pruning, much work remains before they can be understood and improved to match the performance of pruning after training. In particular, recent studies (Frankle et al., 2021; Su et al., 2020) present empirical evidence for the potential of PaI and reveal intriguing properties, e.g., randomly shuffling the connections of a pruned network within each layer preserves or even improves its performance. Our paper gives new perspectives on PaI from the geometry of subnetwork configurations. We propose two quantities to probe the shape of subnetworks: the numbers of effective paths and effective nodes (or channels). Using these quantities, we provide a principled framework to better understand PaI methods. Our main findings are: (i) subnetwork width matters at moderate sparsity levels (< 99%), which is consistent with the competitive performance of layerwise-shuffled subnetworks; (ii) node-path balance plays a critical role in the quality of PaI subnetworks, especially at extreme sparsity levels. These findings point to an important new direction for network pruning, one that takes the subnetwork topology itself into account. To illustrate the promise of this direction, we present a fairly naive method based on SynFlow (Tanaka et al., 2020) and conduct extensive experiments on different architectures and datasets to demonstrate its effectiveness.
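For concreteness, both probes can be computed directly from the binary pruning masks of a feed-forward network: the number of effective paths is a product of the mask matrices, and a node is effective iff at least one unpruned path passes through it. The sketch below is our own minimal illustration under that MLP mask convention, not code from the paper; the function name and mask layout are assumptions.

```python
import numpy as np

def effective_paths_and_nodes(masks):
    """Hypothetical helper (illustration only, not the paper's code).

    masks: list of 0/1 matrices; masks[l] has shape (n_l, n_{l+1}).
    An input-to-output path is effective if every edge along it is kept;
    a node is effective if at least one effective path passes through it.
    """
    # Effective paths: 1^T (M_1 M_2 ... M_L) 1 counts all-kept-edge paths.
    prod = np.ones((1, masks[0].shape[0]))
    for m in masks:
        prod = prod @ m
    n_paths = int(prod.sum())

    # fwd[l][i]: number of kept paths from any input to node i of layer l.
    fwd = [np.ones(masks[0].shape[0])]
    for m in masks:
        fwd.append(fwd[-1] @ m)
    # bwd[l][i]: number of kept paths from node i of layer l to any output.
    bwd = [np.ones(masks[-1].shape[1])]
    for m in reversed(masks):
        bwd.append(m @ bwd[-1])
    bwd.reverse()

    # A node is effective iff it is reachable from the input AND from it
    # the output is reachable; count such nodes over all layers.
    n_nodes = sum(int(np.sum((f > 0) & (b > 0))) for f, b in zip(fwd, bwd))
    return n_paths, n_nodes

# Example: a 3-2-2 MLP with most edges pruned.
m1 = np.array([[1, 0], [0, 1], [0, 0]])
m2 = np.array([[1, 1], [0, 0]])
print(effective_paths_and_nodes([m1, m2]))  # (2, 4)
```

Note that exact path counts grow combinatorially with depth, so a practical implementation would likely work in the log domain; the reachability test for effective nodes is unaffected by this.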
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
TL;DR: Our paper gives new perspectives on pruning at initialization based on subnetwork configurations (particularly effective paths and nodes) to guide the design of better pruning algorithms.