The Power of Sparsity in Convolutional Neural Networks

Soravit Changpinyo; Mark Sandler; Andrey Zhmoginov

01 Jul 2025 (modified: 21 Jul 2022) | Submitted to ICLR 2017 | Readers: Everyone

Abstract: Deep convolutional networks are well known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective way to trade accuracy for size and speed is simply to reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we go a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings in both run time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3.
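The contrast between the two strategies in the abstract can be illustrated with a small parameter count. This is a minimal sketch, not the authors' implementation: the function names, the layer sizes, and in particular the sampling rule (each output channel keeps a fixed number of randomly chosen input channels) are assumptions made for illustration, since the abstract does not specify them.

```python
import random


def channel_mask(c_in, c_out, keep_fraction, seed=0):
    """Random 0/1 mask over (input channel, output channel) pairs.

    Assumption for this sketch: every output channel keeps exactly
    round(keep_fraction * c_in) randomly chosen input channels (at least one).
    """
    rng = random.Random(seed)
    k = max(1, round(keep_fraction * c_in))
    mask = [[0] * c_out for _ in range(c_in)]
    for o in range(c_out):
        for i in rng.sample(range(c_in), k):
            mask[i][o] = 1
    return mask


def sparse_conv_params(kh, kw, mask):
    """Weights left when a whole kh x kw slice is dropped per masked pair."""
    return kh * kw * sum(sum(row) for row in mask)


def thinned_conv_params(kh, kw, c_in, c_out, channel_fraction):
    """Baseline: shrink both channel counts by one fraction, stay dense."""
    thin_in = max(1, round(channel_fraction * c_in))
    thin_out = max(1, round(channel_fraction * c_out))
    return kh * kw * thin_in * thin_out


# Hypothetical 3x3 layer with 64 input and 128 output channels.
mask = channel_mask(c_in=64, c_out=128, keep_fraction=0.25)
dense = 3 * 3 * 64 * 128
sparse = sparse_conv_params(3, 3, mask)
thinned = thinned_conv_params(3, 3, 64, 128, channel_fraction=0.5)
print(dense, sparse, thinned)
```

In this toy setting, keeping a quarter of the channel-to-channel connections costs the same number of weights as halving both channel counts, but the sparse layer still exposes all 64 input and 128 output channels.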

TL;DR: Sparse random connections allow savings to be harvested and are very effective at compressing CNNs.

Conflicts: usc.edu, google.com

Keywords: Deep learning, Supervised Learning
