Abstract: Network pruning is widely used in model compression due to its simplicity and efficiency. Existing methods typically introduce sparse loss regularization to learn masks. However, this sparse regularization approach lacks a clear criterion for evaluating channel importance and relies on manually defined rules, leading to a decline in model performance. In this article, a Self-Supervised Mask Learning (SSML) method for global channel pruning is proposed, casting mask learning as a self-supervised binary classification task to automatically identify less important channels. Specifically, a dedicated pretext task is designed for the channelwise masks, which leverages the original network to generate pseudo-labels from the mask itself to guide mask learning. Then, a polarization mask loss function is proposed, transforming the discrete mask learning problem into a differentiable binary classification problem. The proposed loss function distinguishes the similarity between pseudo-labels and masks, clustering similar masks together in the feature space and separating dissimilar masks, ultimately allowing channels whose masks are 0 to be safely removed without harming the performance of the pruned model. In addition, SSML can be trained from scratch to yield a compact model. Extensive experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate that SSML outperforms state-of-the-art methods. For instance, SSML prunes 52.7% of the FLOPs of ResNet-34 on the ImageNet dataset with only a 0.01% drop in Top-1 accuracy. Moreover, the generalization of SSML is verified on downstream tasks.
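To make the idea of treating channel masks as a differentiable binary classification concrete, the following is a minimal sketch, assuming a sigmoid-gated channel mask trained with a binary cross-entropy term against pseudo-labels plus a polarization penalty that pushes each mask entry toward 0 or 1. The names ChannelGate and polarization_loss, and the way pseudo-labels are obtained, are illustrative assumptions; the paper's actual pretext task and loss formulation may differ.

```python
# Illustrative sketch (not the paper's exact formulation): a learnable
# per-channel gate whose soft mask is driven toward a hard 0/1 decision.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Learnable per-channel gate; sigmoid keeps mask values in (0, 1)."""
    def __init__(self, num_channels: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        m = torch.sigmoid(self.logits)       # soft channel mask in (0, 1)
        return x * m.view(1, -1, 1, 1)       # scale feature maps channel-wise

def polarization_loss(mask: torch.Tensor, pseudo_labels: torch.Tensor) -> torch.Tensor:
    """Binary-classification-style loss on the soft mask.

    pseudo_labels: 0/1 targets per channel, assumed here to come from some
    teacher signal derived from the original (unpruned) network.
    """
    bce = nn.functional.binary_cross_entropy(mask, pseudo_labels)
    # Polarization term: vanishes when every mask entry is exactly 0 or 1.
    polarize = (mask * (1.0 - mask)).mean()
    return bce + polarize

# Toy usage: 16 channels, pseudo-labels mark half of them as prunable.
gate = ChannelGate(16)
features = gate(torch.randn(2, 16, 8, 8))             # gated feature maps
labels = (torch.arange(16) % 2).float()               # hypothetical pseudo-labels
loss = polarization_loss(torch.sigmoid(gate.logits), labels)
loss.backward()
```

In this sketch, channels whose gate converges to 0 would be the candidates for removal; the pruned model is then obtained by dropping those channels and their associated weights.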