Keywords: Winograd convolution, structured pruning, GPU, parallel processor
TL;DR: We propose a novel Winograd structured pruning method, which prunes the weights in the Winograd domain in a structured form with an optimized pruning-unit size for fast Winograd convolution on parallel processors.
Abstract: Convolutional Neural Networks (CNNs) are computationally intensive, which limits deployment into mobile devices.
To minimize operation counts in CNNs, pruning techniques and Winograd's minimal filtering algorithm are widely used; however, the benefit of pruning disappears when both optimizations are naively applied together in a CNN.
To take full advantage of both approaches, two pruning methods were previously proposed: one applies pruning after the kernel transformation, and the other applies filter pruning to Winograd convolution.
Unfortunately, the first method is hardware-unfriendly and the second approach suffers from a significant loss of accuracy.
Thus, we propose a structured pruning method specialized for Winograd convolution that maximizes hardware utilization by accounting for the conversion algorithm on parallel processors.
We analyze the conversion algorithm of Winograd convolution on parallel processing units; we then prune the weights in the Winograd domain in a structured form with an optimized pruning-unit size, which maximizes the parallelism of the hardware while minimizing the loss of accuracy.
For VGG-16 on the ImageNet dataset, our method achieves $1.84\times$ and $2.89\times$ faster inference than the two previous pruning methods, with less than $1\%$ accuracy loss.
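To make the idea concrete, below is a minimal NumPy sketch of pruning weights in the Winograd domain in a structured form. It uses the standard F(2x2, 3x3) kernel-transform matrix; the function names, the choice of grouping along the input-channel dimension, the magnitude-based pruning criterion, and the `unit` parameter (meant to stand in for the parallel width of the target processor) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Kernel-transform matrix G for Winograd F(2x2, 3x3):
# a 3x3 spatial kernel w becomes the 4x4 Winograd-domain tile G @ w @ G.T
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])


def winograd_transform_kernels(weights):
    """Transform spatial 3x3 kernels into the Winograd domain.
    weights: (out_ch, in_ch, 3, 3) -> (out_ch, in_ch, 4, 4)
    """
    return np.einsum('ab,oibc,dc->oiad', G, weights, G)


def structured_prune_winograd(tw, unit=4, sparsity=0.5):
    """Structured pruning of Winograd-domain weights (illustrative sketch).

    At each Winograd-domain position, consecutive input channels are grouped
    into vectors of length `unit` (a hypothetical pruning-unit size chosen to
    match the parallel lanes of the hardware); the groups with the smallest
    L2 norm are zeroed so that entire vectors can be skipped at once.
    """
    out_ch, in_ch, r, c = tw.shape
    assert in_ch % unit == 0, "input channels must be divisible by the pruning unit"
    groups = tw.reshape(out_ch, in_ch // unit, unit, r, c)
    norms = np.linalg.norm(groups, axis=2)            # (out_ch, in_ch//unit, r, c)
    k = int(norms.size * sparsity)                    # number of groups to prune
    thresh = np.sort(norms.ravel())[k]
    keep = (norms >= thresh)[:, :, None, :, :]        # broadcast mask over the unit axis
    return (groups * keep).reshape(out_ch, in_ch, r, c)


# Usage sketch: transform random 3x3 kernels, then prune 70% of the
# Winograd-domain groups with a pruning unit of 8 input channels.
w = np.random.randn(64, 32, 3, 3).astype(np.float32)
tw = winograd_transform_kernels(w)
pruned = structured_prune_winograd(tw, unit=8, sparsity=0.7)
print("fraction of zero Winograd-domain weights:", np.mean(pruned == 0.0))
```

Because whole groups of `unit` contiguous channels are zeroed at the same Winograd-domain position, a parallel processor can skip the corresponding multiply-accumulates for an entire vector lane rather than for scattered individual weights, which is the hardware-utilization argument made in the abstract.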
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Infrastructure (eg, datasets, competitions, implementations, libraries)