Abstract: Traditional channel-wise pruning methods, which reduce the number of network channels, struggle to effectively prune efficient CNN models that contain depth-wise convolutional layers and certain efficient modules, such as the popular inverted residual block.
Prior depth pruning methods, which reduce network depth, are likewise unsuitable for some efficient models because of the normalization layers they contain. Moreover, fine-tuning a subnet obtained by directly removing activation layers corrupts the original model weights, hindering the pruned model from achieving high performance. To address these issues, we propose a novel depth pruning method for efficient models, built on a new block pruning strategy and a progressive training scheme for the subnet. We further extend our pruning method to vision transformer models.
transformer models. Experimental results demonstrate that
our method consistently outperforms existing depth pruning
methods across various pruning configurations. We obtained
three pruned ConvNeXtV1 models with our method applying
on ConvNeXtV1, which surpass most SOTA efficient models
with comparable inference performance. Our method also
achieves state-of-the-art pruning performance on the vision
transformer model.
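The abstract notes that fine-tuning a subnet after outright removal of activation layers corrupts the pretrained weights. Below is a minimal, purely illustrative PyTorch sketch of how a progressive removal could be realized: an activation is blended with the identity and the mixing coefficient is decayed toward zero during fine-tuning. The `ProgressiveActivation` module, the blending formula, and the linear schedule are assumptions for illustration, not the paper's exact method.

```python
import torch
import torch.nn as nn

class ProgressiveActivation(nn.Module):
    """Illustrative sketch (not the paper's implementation):
    f(x) = lam * act(x) + (1 - lam) * x, where `lam` decays from
    1 to 0 over fine-tuning, leaving a pure identity mapping that
    can be dropped once training finishes."""

    def __init__(self, act: nn.Module):
        super().__init__()
        self.act = act
        # Store the mixing coefficient as a buffer so it is saved
        # with the model but not updated by the optimizer.
        self.register_buffer("lam", torch.tensor(1.0))

    def set_lambda(self, lam: float) -> None:
        self.lam.fill_(lam)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lam * self.act(x) + (1.0 - self.lam) * x

# Hypothetical schedule: linear decay over the fine-tuning epochs.
def lambda_at(epoch: int, total_epochs: int) -> float:
    return max(0.0, 1.0 - epoch / total_epochs)
```

Once the coefficient reaches zero, the wrapper reduces to an identity and can be deleted, so the surviving layers adapt gradually instead of being shocked by an abrupt removal of the activation.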