Accelerating convolutional neural networks by group-wise 2D-filter pruning

Niange Yu, Shi Qiu, Xiaolin Hu, Jianmin Li

2017 (modified: 02 Nov 2022), IJCNN 2017, Readers: Everyone

Abstract: Network pruning is an effective way to accelerate Convolutional Neural Networks (CNNs). In recent years, structured pruning methods have been proposed in favor of unstructured methods, as they have shown greater speedup in practical use. Existing structured methods prune along two main dimensions: 3D-filter wise, i.e., removing a 3D-filter as a whole, and filter-shape wise, i.e., removing the same position from all 3D-filters. In this work, we propose a new group-wise 2D-filter pruning approach that is orthogonal and complementary to the existing methods. The proposed approach removes a portion of the 2D-filters from each 3D-filter according to pruning patterns learned from the data, and leads to compressed models that do not require a sophisticated implementation of the convolution operation. A fine-tuning process follows to recover the accuracy, and the knowledge distillation (KD) framework is explored in this process to improve performance. We present our method for learning the pruning patterns as well as the fine-tuning strategy based on knowledge distillation. The proposed approach is validated on two representative CNN models, ZF and VGG16, pre-trained on ILSVRC12. Experimental results demonstrate the effectiveness of our approach. On VGG16, we obtain even higher accuracy after speeding up the network by 4 times.
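
To make the idea of removing 2D-filters group by group more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the 3D-filters of a layer are split into equal-sized groups and that all 3D-filters in a group share one pruning pattern, i.e. they keep the same input channels. The abstract says the patterns are learned from the data but does not say how, so the sketch substitutes a simple L1-norm ranking as a stand-in. The function name `groupwise_2d_filter_mask` and the parameters `num_groups` and `keep_ratio` are hypothetical.

```python
import numpy as np


def groupwise_2d_filter_mask(weights, num_groups, keep_ratio):
    """Build a 0/1 mask over 2D filters, shared by all 3D filters in a group.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Assumption: 2D filters are ranked by summed L1 norm within each group.
    This is a stand-in; the paper learns its pruning patterns from data.
    """
    out_ch, in_ch, _, _ = weights.shape
    if out_ch % num_groups != 0:
        raise ValueError("out_channels must be divisible by num_groups")
    group_size = out_ch // num_groups
    num_keep = max(1, int(round(keep_ratio * in_ch)))

    # L1 norm of every 2D filter, shape (out_channels, in_channels).
    l1 = np.abs(weights).sum(axis=(2, 3))
    mask = np.zeros((out_ch, in_ch), dtype=weights.dtype)
    for g in range(num_groups):
        rows = slice(g * group_size, (g + 1) * group_size)
        # Importance of each input channel for this group of 3D filters.
        score = l1[rows].sum(axis=0)
        keep = np.argsort(score)[-num_keep:]
        # Every 3D filter in the group keeps the same input channels.
        mask[rows, keep] = 1
    return mask


# Toy layer: 8 3D filters, 6 input channels, 3x3 kernels (made-up sizes).
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6, 3, 3))
mask = groupwise_2d_filter_mask(W, num_groups=4, keep_ratio=0.5)

# Zero out the pruned 2D filters by broadcasting the mask over kH and kW.
W_pruned = W * mask[:, :, None, None]
```

Under this assumption, each group only reads a fixed subset of input channels, so the pruned layer could be executed as a set of smaller dense convolutions rather than with a sparse kernel. In practice the masked weights would then be fine-tuned, which the sketch does not cover.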
