Performance analysis of deep neural network based on channel pruning

14 Aug 2024 (modified: 27 Sept 2024) · IEEE ICIST 2024 Conference Submission · CC BY 4.0
TL;DR: This paper discusses the optimization of deep neural network performance through channel pruning, analyzing the impact of pruning on model accuracy and model size.
Abstract: Model compression is a crucial technology for neural network models: by reducing the number of parameters and the computational load, it shrinks the model size, speeds up inference, lowers memory usage, and saves power. This article studies model compression for neural networks, focusing on channel pruning algorithms and compression methods based on the Batch Normalization (BN) layer. Sparse regularization is applied to the scaling factors of the BN layer, which then serve as the basis for determining channel importance and reducing model complexity. The article presents experimental comparisons on the VGGNet-16, ResNet-164, and DenseNet-40 network architectures, covering standard training, sparse regularization, and pruning with fine-tuning. The experiments reveal that the pruned networks achieve comparable or even higher accuracy than the original networks, underscoring the importance of research into model compression technology.
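The BN-based criterion the abstract describes can be illustrated with a minimal PyTorch-style sketch. This is an assumption-laden illustration, not the paper's implementation: PyTorch itself, the function names, and the hyperparameters SPARSITY_LAMBDA and PRUNE_RATIO are all hypothetical. The idea is that an L1 penalty is applied to each BN scaling factor (gamma) during training, and channels whose |gamma| falls below a global magnitude threshold are marked for pruning.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (not taken from the paper).
SPARSITY_LAMBDA = 1e-4   # weight of the L1 penalty on BN scaling factors
PRUNE_RATIO = 0.5        # fraction of channels to prune globally

def add_bn_sparsity_grad(model: nn.Module, lam: float = SPARSITY_LAMBDA) -> None:
    """Call after loss.backward(): adds the subgradient of lam * |gamma|
    to each BN scaling factor, i.e. L1 sparse regularization on the BN layer."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            m.weight.grad.add_(lam * torch.sign(m.weight.data))

def bn_prune_threshold(model: nn.Module, ratio: float = PRUNE_RATIO) -> float:
    """Gather all BN scaling factors across the network and return the
    global magnitude threshold below which channels are treated as unimportant."""
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    k = int(gammas.numel() * ratio)
    return torch.sort(gammas).values[k].item()

def channel_keep_masks(model: nn.Module, threshold: float) -> list[torch.Tensor]:
    """Boolean keep-mask per BN layer: channels with |gamma| <= threshold
    are candidates for removal before fine-tuning the slimmed network."""
    return [m.weight.data.abs() > threshold
            for m in model.modules()
            if isinstance(m, nn.BatchNorm2d)]
```

In a training loop, add_bn_sparsity_grad(model) would be called between loss.backward() and optimizer.step(); after sparse training, the keep-masks would guide construction of a slimmer network that is then fine-tuned, mirroring the standard-training / sparse-regularization / pruning-fine-tuning pipeline the abstract compares.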
Submission Number: 128