Efficient Sparse-Winograd Convolutional Neural Networks

Xingyu Liu, Song Han, Huizi Mao, William J. Dally

Feb 17, 2017 (modified: Feb 19, 2017) ICLR 2017 workshop submission readers: everyone
  • Abstract: Convolutional Neural Networks (CNNs) are compute intensive, which limits their application on mobile devices. Their energy consumption is dominated by the number of multiplies needed to perform the convolutions. Winograd's minimal filtering algorithm (Lavin (2015)) and network pruning (Han et al. (2015)) each reduce the operation count. Unfortunately, these two methods cannot be combined, because applying the Winograd transform fills in the sparsity in both the weights and the activations. We propose two modifications to Winograd-based CNNs that enable these methods to exploit sparsity. First, we prune the weights in the "Winograd domain" (after the transform) to exploit static weight sparsity. Second, we move the ReLU operation into the "Winograd domain" to improve the sparsity of the transformed activations. On CIFAR-10, our method reduces the number of multiplications in the VGG-nagadomi model by 10.2x with no loss of accuracy.
  • TL;DR: Prune and apply ReLU in the Winograd domain for efficient convolutional neural networks
  • Keywords: Deep learning
  • Conflicts: stanford.edu, nvidia.com
  • Authorids: xyl@stanford.edu, songhan@stanford.edu, huizi@stanford.edu, dally@stanford.edu
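
The fill-in problem described in the abstract can be illustrated with a minimal sketch of the standard 1-D Winograd F(2,3) algorithm (2 outputs, 3-tap filter). This is not the authors' implementation: the helper names (`matvec`, `nonzeros`, `needed_multiplies`), the toy filter `g`, and the toy input tile `d` are assumptions chosen for illustration. The sketch shows that a filter and an activation tile that are sparse in the spatial domain become denser after the transform, and that zeroing values after the transform (pruning or ReLU in the Winograd domain) is what reduces the count of element-wise multiplies.

```python
# Standard F(2,3) Winograd transform matrices: y = AT [(G g) * (BT d)].
G = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def nonzeros(v):
    """Count entries that are not zero."""
    return sum(1 for x in v if x != 0)

def relu(v):
    return [max(0.0, x) for x in v]

def needed_multiplies(u, v):
    """Element-wise products that cannot be skipped (both operands nonzero)."""
    return sum(1 for a, b in zip(u, v) if a != 0 and b != 0)

def winograd_f23(d, g):
    """Two outputs of a 3-tap filter over a 4-element tile, via Winograd."""
    u, v = matvec(G, g), matvec(BT, d)
    return matvec(AT, [a * b for a, b in zip(u, v)])

def direct_f23(d, g):
    """Reference: direct sliding-window correlation."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

# Toy data (assumed): a pruned filter and a post-ReLU activation tile.
g = [1.0, 0.0, 0.0]        # 1 of 3 weights is nonzero
d = [0.0, 3.0, 0.0, 0.0]   # 1 of 4 activations is nonzero

u = matvec(G, g)           # transformed weights
v = matvec(BT, d)          # transformed activations

print("spatial nonzeros :", nonzeros(g), "weights,", nonzeros(d), "activations")
print("Winograd nonzeros:", nonzeros(u), "weights,", nonzeros(v), "activations")

# Zeroing values after the transform restores sparsity. Note that this changes
# the function the layer computes, so it is not equivalent to the original layer.
v_relu = relu(v)
print("multiplies, dense transform    :", needed_multiplies(u, v))
print("multiplies, ReLU after transform:", needed_multiplies(u, v_relu))
```

In this toy case the transformed weights and activations each have three nonzeros out of four, even though each had only one nonzero before the transform. Applying ReLU after the transform is shown only to count the multiplies that can be skipped; a layer modified this way is a different function from the original.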