The Power of Sparsity in Convolutional Neural Networks
Soravit Changpinyo, Mark Sandler, Andrey Zhmoginov
Nov 04, 2016 (modified: Dec 03, 2016) · ICLR 2017 conference submission · Readers: everyone
Abstract: Deep convolutional networks are well-known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective approach to trade accuracy for size and speed is to simply reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we take a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings both in run-time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3.
TL;DR: Sparse random connections that allow savings to be harvested and are very effective at compressing CNNs.
Keywords: Deep learning, Supervised Learning
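
The abstract's core idea, generalizing 2D convolution to a channel-wise sparse connection structure, can be illustrated with a short sketch. This is not the authors' implementation: the use of PyTorch, the ChannelSparseConv2d name, and the density parameter are assumptions made here for illustration. A fixed random binary mask over (output channel, input channel) pairs deactivates entire k x k kernel slices; an optimized implementation would skip the zeroed connections to realize the run-time and memory savings, whereas this sketch only masks the weights of a dense convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSparseConv2d(nn.Module):
    """2D convolution with a fixed random channel-to-channel connectivity mask.

    mask[o, i] == 1 keeps the full k x k kernel connecting input channel i to
    output channel o; mask[o, i] == 0 deactivates that connection entirely.
    """
    def __init__(self, in_channels, out_channels, kernel_size, density=0.5, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, **kwargs)
        # Random binary connectivity, drawn once and kept fixed during training.
        mask = (torch.rand(out_channels, in_channels) < density).float()
        self.register_buffer("mask", mask[:, :, None, None])

    def forward(self, x):
        # Zero out deactivated channel pairs; here the convolution itself is
        # still dense, so the sketch shows the connectivity pattern only.
        return F.conv2d(x, self.conv.weight * self.mask, self.conv.bias,
                        stride=self.conv.stride, padding=self.conv.padding,
                        dilation=self.conv.dilation, groups=self.conv.groups)

# Example: a layer that keeps roughly half of the channel-to-channel connections.
layer = ChannelSparseConv2d(64, 128, kernel_size=3, density=0.5, padding=1)
y = layer(torch.randn(1, 64, 32, 32))  # -> shape (1, 128, 32, 32)
```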