A Novel Clustering-Based Filter Pruning Method for Efficient Deep Neural Networks

Published: 01 Jan 2020 · Last Modified: 24 May 2024 · ICA3PP (2) 2020 · CC BY-SA 4.0
Abstract: Deep neural networks have achieved great success in various applications, accompanied by significant increases in computational operations and storage costs. These costs make such models difficult to deploy on embedded systems. Model compression is therefore a popular solution to reduce these overheads. In this paper, a new filter pruning method based on clustering is proposed to compress network models. First, we cluster filters by their features and select one filter from each cluster as a representative. Next, we rank all representatives according to their impact on the result and select a configurable number of top filters. Finally, we prune the redundant connections that are not selected. We empirically demonstrate the effectiveness of our approach on several network models, including VGG and ResNet. Experimental results show that on CIFAR-10, our method reduces inference costs for VGG-16 by up to 44% and for ResNet-32 by up to 50%, while accuracy recovers to close to the original level.
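The three-step procedure in the abstract (cluster filters, pick one representative per cluster, rank and keep the top ones) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the clustering algorithm or the importance score, so this sketch assumes k-means over flattened filter weights and an L1-norm importance score as stand-ins.

```python
import numpy as np

def cluster_filters(filters, k, iters=20, seed=0):
    """Naive k-means over flattened filter weights (illustration only)."""
    rng = np.random.default_rng(seed)
    X = filters.reshape(filters.shape[0], -1)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each filter to its nearest centroid.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers

def select_filters(filters, k, keep):
    """Return indices of filters to keep; all others would be pruned."""
    labels, centers = cluster_filters(filters, k)
    X = filters.reshape(filters.shape[0], -1)
    # Step 1-2: one representative per cluster (the filter nearest its centroid).
    reps = []
    for c in range(k):
        idx = np.where(labels == c)[0]
        if len(idx):
            dist = ((X[idx] - centers[c]) ** 2).sum(-1)
            reps.append(idx[np.argmin(dist)])
    reps = np.array(reps)
    # Step 3: rank representatives by L1 norm (assumed importance score)
    # and keep the top `keep` of them.
    order = np.argsort(-np.abs(X[reps]).sum(-1))
    return np.sort(reps[order[:keep]])
```

In practice the kept indices would be used to slice the convolution layer's weight tensor (and the corresponding input channels of the next layer), after which the network is fine-tuned to regain accuracy.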