Efficient Sparse-Winograd Convolutional Neural Networks

Xingyu Liu; Song Han; Huizi Mao; William J. Dally

06 Jun 2025 (modified: 19 Feb 2017), ICLR 2017, Readers: Everyone

Abstract: Convolutional Neural Networks (CNNs) are compute-intensive, which limits their application on mobile devices. Their energy consumption is dominated by the number of multiplies needed to perform the convolutions. Winograd's minimal filtering algorithm (Lavin (2015)) and network pruning (Han et al. (2015)) each reduce the operation count. Unfortunately, these two methods cannot be combined directly, because applying the Winograd transform fills in the sparsity in both the weights and the activations. We propose two modifications to Winograd-based CNNs to enable these methods to exploit sparsity. First, we prune the weights in the "Winograd domain" (after the transform) to exploit static weight sparsity. Second, we move the ReLU operation into the "Winograd domain" to improve the sparsity of the transformed activations. On CIFAR-10, our method reduces the number of multiplications in the VGG-nagadomi model by 10.2x with no loss of accuracy.

TL;DR: Prune and apply ReLU in the Winograd domain for efficient convolutional neural networks

Keywords: Deep learning

Conflicts: stanford.edu, nvidia.com
