Keywords: Pruning, Supermasking, Gradient-free training, Sparse networks
Abstract: Pruning overparameterized neural networks to obtain memory- and compute-efficient sparse networks is an active area of research. Recent works attempt to prune neural networks at initialization to design sparse networks that can be trained efficiently. In this paper we propose One-Shot Supermasking (OSSuM), a gradient-free, compute-efficient technique to prune neurons in fully-connected networks. We frame this problem as a neuron subset selection problem, in which we prune neurons to improve accuracy by optimizing the cross-entropy loss. In our experiments we show that OSSuM can perform similarly to gradient-based pruning techniques applied at initialization, prior to training. For example, OSSuM can achieve a test set accuracy of $82.4\%$ on MNIST by pruning a 2-layer fully-connected neural network at initialization with just a single forward pass over the training data. Further, we empirically demonstrate that OSSuM can also be used to efficiently prune trained networks. We also propose several variants of OSSuM for pruning deeper neural networks.
One-sentence Summary: We propose OSSuM, a gradient-free, one-shot technique to efficiently prune neurons in fully-connected networks. Our algorithm can improve the classification performance of randomly initialized networks by one-shot pruning alone.
Supplementary Material: zip
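To make the general idea of one-shot, gradient-free neuron pruning concrete, here is a minimal sketch. It does not reproduce the paper's actual OSSuM scoring rule, which is not given here; it assumes a simple proxy criterion in which each hidden neuron is scored from a single forward pass (mean absolute activation times outgoing weight norm) and the lowest-scoring neurons are masked. The function names (`score_hidden_neurons`, `one_shot_supermask`) and the `keep_ratio` parameter are hypothetical.

```python
# Illustrative sketch only: assumes an activation-magnitude scoring rule,
# not the paper's actual OSSuM criterion. Gradient-free: one forward pass
# over the training data, then a binary mask over hidden neurons.
import numpy as np

rng = np.random.default_rng(0)

def score_hidden_neurons(X, W1, b1, W2):
    """Single forward pass to the hidden layer; no gradients required."""
    h = np.maximum(X @ W1 + b1, 0.0)          # ReLU activations, shape (N, H)
    act = np.abs(h).mean(axis=0)              # mean |activation| per hidden neuron
    out_norm = np.linalg.norm(W2, axis=1)     # outgoing weight norm per hidden neuron
    return act * out_norm                     # saliency-style importance score

def one_shot_supermask(scores, keep_ratio=0.5):
    """Binary mask keeping the top `keep_ratio` fraction of neurons."""
    k = max(1, int(keep_ratio * scores.size))
    keep = np.argsort(scores)[-k:]
    mask = np.zeros_like(scores)
    mask[keep] = 1.0
    return mask

def predict(X, W1, b1, W2, b2, mask):
    """2-layer fully-connected classifier with pruned hidden neurons zeroed out."""
    h = np.maximum(X @ W1 + b1, 0.0) * mask
    logits = h @ W2 + b2
    return logits.argmax(axis=1)

# Toy usage on random data standing in for MNIST-shaped inputs.
X = rng.standard_normal((256, 784))
W1, b1 = 0.1 * rng.standard_normal((784, 128)), np.zeros(128)
W2, b2 = 0.1 * rng.standard_normal((128, 10)), np.zeros(10)

scores = score_hidden_neurons(X, W1, b1, W2)
mask = one_shot_supermask(scores, keep_ratio=0.5)
preds = predict(X, W1, b1, W2, b2, mask)
print("kept neurons:", int(mask.sum()), "of", mask.size)
```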