BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

Published: 2024, Last Modified: 19 Oct 2024. Venue: CVPR 2024. License: CC BY-SA 4.0
Abstract: Most existing dynamic or runtime channel pruning methods have to store all weights to achieve efficient inference, which brings extra storage costs. Static pruning methods can reduce storage costs directly, but their performance is limited by using a fixed sub-network to approximate the original model. Most existing pruning works suffer from these drawbacks because they were designed to only conduct either static or dynamic pruning. In this paper, we propose a novel method to solve both efficiency and storage challenges via simultaneously conducting dynamic and static channel pruning for convolutional neural networks. We propose a new bi-level optimization based model to naturally integrate the static and dynamic channel pruning. By doing so, our method enjoys benefits from both sides, and the disadvantages of dynamic and static pruning are reduced. After pruning, we permanently remove redundant parameters and then finetune the model with dynamic flexibility. Experimental results on CIFAR-10 and ImageNet datasets suggest that our method can achieve state-of-the-art performance compared to existing dynamic and static channel pruning methods.
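The distinction the abstract draws between static and dynamic channel pruning can be illustrated with a toy sketch. This is not the paper's bi-level formulation, which the abstract does not spell out. It assumes a single 1x1 "layer" whose channels are plain weight rows, a hand-picked static keep list, and a hypothetical per-channel gate (`gate_a`, `gate_b`) driven by the mean of the input. All names and values below are illustrative assumptions.

```python
from typing import List

def dot(w: List[float], x: List[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))

def static_prune(weights: List[List[float]], keep: List[int]) -> List[List[float]]:
    """Static pruning: permanently drop channels not in `keep`.
    Only the returned rows need to be stored."""
    return [weights[c] for c in keep]

def dynamic_gate(x: List[float], gate_a: List[float],
                 gate_b: List[float], k: int) -> List[int]:
    """Dynamic pruning: choose k stored channels per input.
    Hypothetical gate: score_c = gate_a[c] * mean(x) + gate_b[c]."""
    s = sum(x) / len(x)
    scores = [a * s + b for a, b in zip(gate_a, gate_b)]
    order = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(order[:k])

def forward(x: List[float], stored: List[List[float]],
            gate_a: List[float], gate_b: List[float], k: int) -> List[float]:
    """Compute only the channels the gate selects; skipped channels output 0."""
    active = set(dynamic_gate(x, gate_a, gate_b, k))
    return [dot(w, x) if c in active else 0.0 for c, w in enumerate(stored)]

# Four original channels; channel 1 is removed for good (static step).
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
stored = static_prune(weights, keep=[0, 2, 3])

# Illustrative gate parameters for the three stored channels.
gate_a = [1.0, -1.0, 0.5]
gate_b = [0.0, 0.0, 0.0]

out_pos = forward([2.0, 4.0, 6.0], stored, gate_a, gate_b, k=2)
out_neg = forward([-2.0, -4.0, -6.0], stored, gate_a, gate_b, k=2)
```

In this sketch the static step shrinks what is stored (three rows instead of four), while the gate activates a different pair of stored channels for each of the two inputs, which is the kind of per-input flexibility the abstract attributes to dynamic pruning.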