Towards Performance-maximizing Network Pruning via Global Channel Attention

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone
Keywords: Channel Pruning, Global Attention, Deep Neural Networks, Model Compression
TL;DR: GlobalPru is a static channel pruning method that combines the advantages of static and dynamic methods via global channel attention, achieving much higher compression rates and better accuracy.
Abstract: Network pruning has attracted increasing attention for its ability to deploy large-scale neural networks (e.g., CNNs) on resource-constrained devices. Such deployment is typically achieved by removing redundant network parameters while retaining generalization performance, in either a static or a dynamic pruning manner. Concretely, static pruning usually maintains a larger, fit-to-all-samples compressed network by removing the same channels for all samples, whereas dynamic pruning can adaptively remove (more) different channels for different samples and thus attains state-of-the-art performance along with a higher compression ratio. However, because the system must preserve the complete network for sample-specific pruning, dynamic pruning methods are usually not memory-efficient. In this paper, we explore a static alternative to conventional static pruning, dubbed GlobalPru, that accounts for both the compression ratio and performance maximization. Specifically, a novel channel attention-based learn-to-rank algorithm is proposed to learn the global channel attention of the network across samples, wherein each sample-specific channel saliency is forced to reach an agreement on the global ranking. Hence, all samples can empirically share the same pruning priority of channels, achieving channel pruning with minimal performance loss. Extensive experiments demonstrate that the proposed GlobalPru outperforms state-of-the-art static and dynamic pruning methods by significant margins.
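To make the abstract's mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of the general idea: score channels per sample with an attention-style module, encourage every sample's saliency to agree with one learnable global channel score vector, and then statically prune the lowest-ranked channels. All names here (ChannelSaliency, ranking_agreement_loss, channels_to_prune) and design choices are illustrative assumptions, not the authors' actual GlobalPru implementation, which is not provided on this page.

```python
# Hypothetical sketch, not the authors' code: per-sample channel saliency,
# an agreement loss toward a shared global ranking, and static pruning.
import torch
import torch.nn as nn


class ChannelSaliency(nn.Module):
    """Squeeze-and-excitation-style module that scores each channel per sample."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-sample channel saliency of shape (N, C)
        pooled = x.mean(dim=(2, 3))
        return self.fc(pooled)


def ranking_agreement_loss(saliency: torch.Tensor,
                           global_scores: torch.Tensor) -> torch.Tensor:
    """Soft surrogate for 'all samples agree on one channel ranking':
    penalize the gap between each sample's saliency and the global scores."""
    # saliency: (N, C); global_scores: (C,)
    return (saliency - global_scores.unsqueeze(0)).pow(2).mean()


@torch.no_grad()
def channels_to_prune(global_scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Indices of channels to remove: the lowest-scored (1 - keep_ratio) fraction."""
    c = global_scores.numel()
    n_keep = max(1, int(round(keep_ratio * c)))
    order = torch.argsort(global_scores, descending=True)
    return order[n_keep:]


if __name__ == "__main__":
    sal_mod = ChannelSaliency(channels=64)
    global_scores = nn.Parameter(torch.zeros(64))    # learned jointly with the network
    x = torch.randn(8, 64, 32, 32)
    s = sal_mod(x)                                   # (8, 64) per-sample saliency
    loss = ranking_agreement_loss(s, global_scores)  # added to the task loss during training
    prune_idx = channels_to_prune(global_scores.detach(), keep_ratio=0.5)
    print(loss.item(), prune_idx.shape)
```

Because the agreement term pushes every sample toward the same global score vector, the channels selected by `channels_to_prune` can be removed once, statically, rather than per sample at inference time.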
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning