Pruning Networks Only Using Few-Shot Pretraining Based on Gradient Similarity Frequency

Published: 01 Jan 2025 · Last Modified: 14 Aug 2025 · IEEE Trans. Artif. Intell. 2025 · CC BY-SA 4.0
Abstract: Neural network pruning is a popular and promising approach for reducing heavy networks to lightweight ones by removing redundancies. Most existing methods adopt a three-stage pipeline of pretraining, pruning, and fine-tuning. However, training a large and redundant network during the pretraining stage is time-consuming. In this work, we propose a new minimal-pretraining pruning method, gradient similarity frequency-based pruning (GSFP), which prunes a given network before training using only few-shot pretraining. Instead of fully pretraining an over-parameterized model, our method uses only one epoch to rank the convolution filters by their gradient similarity frequency and to determine the redundant filters that should be removed. The resulting sparse network is then trained in the standard way, without fine-tuning weights inherited from the full model. Finally, a series of experiments on CIFAR-10/100 and ImageNet verifies the effectiveness of the method. The results show that our approach achieves remarkable results on popular networks such as VGG, ResNet, and DenseNet. Importantly, the proposed pruning approach never requires pretraining the over-parameterized model, offering a promising prospect for application and deployment under limited computational resources.
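The abstract names gradient similarity frequency as the ranking criterion but does not define it, so the following is only a minimal PyTorch sketch under assumptions: it counts, after a backward pass, how many other filters in the same layer have a highly similar (cosine similarity above a chosen threshold) weight gradient, and treats filters with high counts as redundant. The function name `gsf_scores`, the 0.9 threshold, and the per-batch counting scheme are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of a gradient-similarity-frequency score for filter pruning.
# NOT the paper's implementation: the threshold and counting scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def gsf_scores(conv: nn.Conv2d, threshold: float = 0.9) -> torch.Tensor:
    """For each filter, count how many other filters in the same layer have a
    highly similar (cosine > threshold) weight gradient after a backward pass."""
    grad = conv.weight.grad                      # shape: (out_ch, in_ch, k, k)
    flat = grad.flatten(1)                       # one gradient vector per filter
    sim = F.cosine_similarity(flat.unsqueeze(1), flat.unsqueeze(0), dim=-1)
    sim.fill_diagonal_(0.0)                      # ignore self-similarity
    return (sim > threshold).sum(dim=1).float()  # higher count => more redundant

# Usage: accumulate scores over one epoch of few-shot pretraining, then remove
# the filters with the highest accumulated counts before standard training.
if __name__ == "__main__":
    conv = nn.Conv2d(3, 16, 3, padding=1)
    x = torch.randn(8, 3, 32, 32)
    loss = conv(x).pow(2).mean()                 # dummy loss to produce gradients
    loss.backward()
    scores = gsf_scores(conv)
    prune_idx = scores.topk(k=4).indices         # e.g., mark the 4 most redundant
    print("filters ranked most redundant:", prune_idx.tolist())
```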