Channel Gating Neural Networks

Weizhe Hua, Yuan Zhou, Chris De Sa, Zhiru Zhang, G. Edward Suh

06 Sept 2019 (modified: 05 May 2023) · NeurIPS 2019
Abstract: This paper introduces channel gating, a dynamic, fine-grained, and highly hardware-efficient pruning scheme to reduce the compute cost of convolutional neural networks (CNNs). Channel gating identifies regions in the features that contribute less to the classification result, and skips the computation on a subset of the input channels for these ineffective regions. Unlike static network pruning, channel gating optimizes CNN inference at run-time by exploiting input-specific characteristics, which substantially reduces the compute cost with almost no accuracy loss. We experimentally show that applying channel gating to state-of-the-art networks achieves a 2.7-8.0x reduction in FLOPs with minimal accuracy loss on CIFAR-10. Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6x without accuracy degradation on ImageNet. We further demonstrate that channel gating can be realized efficiently in hardware. Our approach exhibits sparsity patterns that are well-suited to dense systolic arrays with minimal additional hardware. We have designed an accelerator for channel gating networks, which can be implemented using either FPGAs or ASICs. Running a quantized ResNet-18 model for ImageNet, our accelerator achieves an encouraging speedup of 2.4x on average, with a theoretical FLOP reduction of 2.8x.
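The core mechanism described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it simplifies the convolution to a 1x1 conv, uses a fixed gating threshold instead of a learned one, and the function name `channel_gate` and the base/conditional channel split sizes are illustrative assumptions. It shows the idea of computing a partial sum over a "base" subset of input channels everywhere, then using that partial sum to decide, per output position, whether the remaining channels are worth computing.

```python
import numpy as np

def channel_gate(x, w_base, w_cond, threshold):
    """Illustrative sketch of channel gating for one layer (1x1 conv for brevity).

    x:         input feature map, shape (C_in, H, W)
    w_base:    weights over the "base" input channels, shape (C_out, C_base)
    w_cond:    weights over the remaining channels,  shape (C_out, C_in - C_base)
    threshold: gating threshold (learned in the paper; a fixed scalar here)
    """
    c_base = w_base.shape[1]
    x_base, x_cond = x[:c_base], x[c_base:]

    # Partial sums from the base channels, computed at every spatial position.
    y_base = np.einsum('oc,chw->ohw', w_base, x_base)

    # Gate: positions whose base partial sum is small are deemed "ineffective"
    # and skip the conditional-channel computation.
    gate = y_base > threshold  # boolean, shape (C_out, H, W)

    # Conditional path (computed densely here for clarity; the paper's
    # accelerator skips the masked-out work to save FLOPs).
    y_cond = np.einsum('oc,chw->ohw', w_cond, x_cond)
    return y_base + gate * y_cond, gate
```

Where the gate is closed, the output is just the base partial sum, so the fraction of closed gate positions directly measures the FLOPs skipped on the conditional path.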
Code Link: https://drive.google.com/drive/folders/16Rkbq-6dS8Yw8wsN_3ghpwpZwYcc7J-c?usp=sharing
CMT Num: 1093
