Dynamic Probabilistic Pruning: Training sparse networks based on stochastic and dynamic masking

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Withdrawn Submission · Readers: Everyone
Keywords: deep probabilistic subsampling, sparse deep learning, structured pruning, hardware-oriented pruning
Abstract: Deep Learning (DL) models are known to be heavily over-parametrized, resulting in a large memory footprint and power consumption. This hampers the use of such models in hardware-constrained edge technologies such as wearables and mobile devices. Model compression during training can be achieved by promoting sparse network structures, both through weight regularization and by leveraging dynamic pruning methods. State-of-the-art pruning methods are, however, mostly magnitude-based, which impedes their use in, e.g., binary settings. Importantly, most pruning methods do not provide structural sparsity, resulting in inefficient memory allocation and access for hardware implementations. In this paper, we propose a novel dynamic pruning solution that we term Dynamic Probabilistic Pruning (DPP). DPP leverages Gumbel top-K sampling to select subsets of weights during training, which enables exploring which weights are most relevant. Our approach allows for setting an explicit per-neuron, layer-wise sparsity level and for structural pruning across weights and feature maps, without relying on weight-magnitude heuristics. Notably, our method generates a hardware-oriented structural sparsity for fully-connected and convolutional layers that facilitates memory allocation and access, in contrast with conventional unstructured pruning. We show that DPP achieves competitive sparsity levels and classification accuracy on the MNIST, CIFAR-10, and CIFAR-100 datasets compared to a state-of-the-art baseline for various DL architectures, while respecting per-neuron sparsity constraints.
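The core mechanism described in the abstract, selecting a fixed-size subset of weights per neuron via Gumbel top-K sampling, can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the learnable relevance logits, the temperature `tau`, the straight-through softmax relaxation, and names such as `gumbel_topk_mask` and `score_logits` are assumptions for this example, not the authors' exact formulation.

```python
import torch


def gumbel_topk_mask(logits: torch.Tensor, k: int, tau: float = 1.0) -> torch.Tensor:
    """Sample a binary mask keeping k entries per row via the Gumbel top-K trick.

    `logits` are learnable relevance scores, e.g. shaped (out_features, in_features),
    so the per-row top-K enforces an explicit per-neuron sparsity level.
    """
    # Perturb the logits with i.i.d. Gumbel(0, 1) noise.
    u = torch.rand_like(logits).clamp_min(1e-20)   # avoid log(0)
    gumbel = -torch.log(-torch.log(u))
    perturbed = (logits + gumbel) / tau

    # Hard selection: keep the k highest perturbed scores in each row.
    topk_idx = perturbed.topk(k, dim=-1).indices
    hard_mask = torch.zeros_like(logits).scatter_(-1, topk_idx, 1.0)

    # Straight-through estimator (an assumption here): forward uses the hard mask,
    # gradients flow through a softmax relaxation of the perturbed logits.
    soft = torch.softmax(perturbed, dim=-1)
    return hard_mask + soft - soft.detach()


# Illustrative use: mask a fully-connected weight matrix during the forward pass.
weight = torch.randn(128, 256, requires_grad=True)        # layer weights
score_logits = torch.zeros(128, 256, requires_grad=True)  # learnable relevance logits
mask = gumbel_topk_mask(score_logits, k=32)                # keep 32 of 256 inputs per neuron
sparse_weight = weight * mask                              # per-neuron structured sparsity
```

Because every row of the mask keeps exactly k entries, the resulting sparsity pattern is regular across neurons, which is what makes memory allocation and access predictable for hardware, in contrast with unstructured magnitude pruning.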
One-sentence Summary: Hardware-oriented dynamic probabilistic pruning method that learns to generate structured sparsity for fully-connected and convolutional layers
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=3dDyG9oCyr