LeanFlex-GKP: Advancing Hassle-Free Structured Pruning with Simple Flexible Group Count

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: pruning, structured pruning, grouped kernel pruning, CNN, one-shot
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Pruning grouped kernels while remaining densely structured is great, but existing GKP methods rely on dynamic operations that add complications. We argue the best place to introduce such flexibility is `Conv2d(groups)` (the group count), yielding a method with improved performance, efficiency, and user-friendliness.
Abstract: Densely structured pruning methods, which generate pruned models in a fully dense format and thus allow immediate compression benefits without additional demands, are evolving owing to their practical significance. Traditional techniques in this domain mainly revolve around coarser granularities, such as filter pruning, thereby limiting their performance due to restricted pruning freedom. Recent advancements in *Grouped Kernel Pruning (GKP)* have enabled the utilization of finer granularity while maintaining the densely structured format. We observe that existing GKP methods often introduce dynamic operations to different aspects of their procedures, many of which come at the cost of added complications and/or imposed limitations: e.g., requiring an expensive mixture of clustering schemes, or having dynamic pruning rates and group sizes, which leads to reliance on custom architecture support for the pruned models. In this work, we argue that the best practice for introducing such a dynamic operation to GKP is to make `Conv2d(groups)` (a.k.a. the group count) flexible under an integral optimization, leveraging its ideal alignment with the infrastructure support of *Grouped Convolution*. Pursuing this direction, we present a one-shot, post-train, data-agnostic GKP method that is more performant, adaptive, and efficient than its predecessors, while simultaneously being considerably more user-friendly, with little-to-no hyper-parameter tuning or handcrafted criteria required.
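To illustrate why aligning the pruning granularity with `Conv2d(groups)` yields a densely structured result, here is a minimal PyTorch sketch (not the authors' implementation; the layer sizes and group count are illustrative assumptions). It shows that a layer whose kernels are pruned group-wise can be expressed directly as a built-in grouped convolution, so the pruned model stays fully dense and needs no custom architecture or sparse-kernel support.

```python
# Minimal sketch, assuming illustrative layer sizes and group count.
import torch
import torch.nn as nn

in_channels, out_channels = 64, 128
groups = 4  # a flexible, per-layer group count would be chosen by the method

# Original dense layer: weight shape (128, 64, 3, 3).
dense = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False)

# Grouped layer standing in for the pruned result: weight shape (128, 16, 3, 3),
# i.e. each output filter keeps only the kernels of its own input-channel group,
# yet the stored weights remain a fully dense tensor.
pruned = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1,
                   bias=False, groups=groups)

x = torch.randn(1, in_channels, 32, 32)
print(dense(x).shape, pruned(x).shape)          # both: torch.Size([1, 128, 32, 32])
print(dense.weight.shape, pruned.weight.shape)  # (128, 64, 3, 3) vs (128, 16, 3, 3)
```

Because grouped convolution is natively supported by standard deep-learning frameworks and hardware kernels, the compression benefit is realized immediately at inference time, which is the practical advantage the abstract emphasizes.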
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7227