LeanFlex-GKP: Advancing Hassle-Free Structured Pruning with Simple Flexible Group Count

Published: 28 Oct 2023, Last Modified: 01 Dec 2023WANT@NeurIPS 2023 PosterEveryoneRevisionsBibTeX
Keywords: pruning, structured pruning, structural pruning, grouped kernel pruning, one-shot, CNN
TL;DR: Pruning Grouped Kernels while remaining structured is great. But all GKP methods seem to rely on dynamic operations. We argue it is best to do it at Conv2d(groups), resulting in a method with improved performance, efficiency, and user-friendliness.
Abstract: Densely structured pruning methods — which generate pruned models in a fully dense format, allowing immediate compression benefits without additional demands — are evolving due to their practical significance. Traditional techniques in this domain mainly revolve around coarser granularities, such as filter pruning, and thereby limit performance due to a restricted pruning freedom. Recent advancements in *Grouped Kernel Pruning (GKP)* have enabled the utilization of finer granularities while maintaining a densely structured format. We observe that existing GKP methods often introduce dynamic operations to different aspects of their procedures at the cost of adding complications and/or imposing limitations (e.g. requiring an expensive mixture of clustering schemes), or contain dynamic pruning rates and sizes among groups which results in a reliance on custom architecture support for its pruned models. In this work, we argue that the best practice to introduce these dynamic operations to GKP is to make `Conv2d(groups)` (a.k.a. group count) flexible under an integral optimization, leveraging its ideal alignment with the infrastructure support *Grouped Convolution*. Pursuing such a direction, we present a one-shot, post-train, data-agnostic GKP method that is more performant, adaptive, and efficient than its predecessors while simultaneously being a lot more user-friendly, with little-to-no hyper-parameter tuning or handcrafting of criteria required.
Submission Number: 52