Keywords: Winograd convolution, structured pruning, GPU, parallel processor
TL;DR: We propose a novel Winograd structured pruning method, which prunes the weights in the Winograd domain in a structured form with an optimized pruning-unit size for fast Winograd convolution on parallel processors.
Abstract: Convolutional Neural Networks (CNNs) are computationally intensive, which limits deployment into mobile devices.
To minimize operation counts in CNNs, pruning techniques and Winograd's minimal filtering algorithm are widely used; however, the benefit of pruning disappears when both optimizations are naively applied together in a CNN.
To take full advantage of both approaches, two pruning methods were previously proposed: one applies pruning after the kernel transformation, and the other applies filter pruning to Winograd convolution.
Unfortunately, the first method is hardware-unfriendly and the second approach suffers from a significant loss of accuracy.
Thus, we propose a structured pruning method specialized for Winograd convolution that maximizes hardware utilization by accounting for the conversion algorithm on parallel processors.
We analyze the conversion algorithm of Winograd convolution on parallel processing units; we then prune the weights in the Winograd domain in a structured form with an optimized pruning-unit size, which maximizes the parallelism of the hardware while minimizing the loss of accuracy.
For VGG-16 on the ImageNet dataset, our method achieves $1.84\times$ and $2.89\times$ faster inference than the two previous pruning methods, with less than $1\%$ accuracy loss.
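To make the idea concrete, below is a minimal NumPy sketch of pruning weights in the Winograd domain in a structured form. It uses the standard F(2x2, 3x3) kernel-transform matrix; the function names, the choice of grouping along the input-channel dimension, the magnitude-based pruning criterion, and the `unit` parameter (meant to stand in for the parallel width of the target processor) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Kernel-transform matrix G for Winograd F(2x2, 3x3):
# a 3x3 spatial kernel w becomes the 4x4 Winograd-domain tile G @ w @ G.T
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])


def winograd_transform_kernels(weights):
    """Transform spatial 3x3 kernels into the Winograd domain.
    weights: (out_ch, in_ch, 3, 3) -> (out_ch, in_ch, 4, 4)
    """
    return np.einsum('ab,oibc,dc->oiad', G, weights, G)


def structured_prune_winograd(tw, unit=4, sparsity=0.5):
    """Structured pruning of Winograd-domain weights (illustrative sketch).

    At each Winograd-domain position, consecutive input channels are grouped
    into vectors of length `unit` (a hypothetical pruning-unit size chosen to
    match the parallel lanes of the hardware); the groups with the smallest
    L2 norm are zeroed so that entire vectors can be skipped at once.
    """
    out_ch, in_ch, r, c = tw.shape
    assert in_ch % unit == 0, "input channels must be divisible by the pruning unit"
    groups = tw.reshape(out_ch, in_ch // unit, unit, r, c)
    norms = np.linalg.norm(groups, axis=2)            # (out_ch, in_ch//unit, r, c)
    k = int(norms.size * sparsity)                    # number of groups to prune
    thresh = np.sort(norms.ravel())[k]
    keep = (norms >= thresh)[:, :, None, :, :]        # broadcast mask over the unit axis
    return (groups * keep).reshape(out_ch, in_ch, r, c)


# Usage sketch: transform random 3x3 kernels, then prune 70% of the
# Winograd-domain groups with a pruning unit of 8 input channels.
w = np.random.randn(64, 32, 3, 3).astype(np.float32)
tw = winograd_transform_kernels(w)
pruned = structured_prune_winograd(tw, unit=8, sparsity=0.7)
print("fraction of zero Winograd-domain weights:", np.mean(pruned == 0.0))
```

Because whole groups of `unit` contiguous channels are zeroed at the same Winograd-domain position, a parallel processor can skip the corresponding multiply-accumulates for an entire vector lane rather than for scattered individual weights, which is the hardware-utilization argument made in the abstract.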
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Infrastructure (eg, datasets, competitions, implementations, libraries)