Rethinking the Value of Training-Free Structured Pruning of LLMs

TMLR Paper3974 Authors

15 Jan 2025 (modified: 25 Mar 2025) · Decision pending for TMLR · Everyone · Revisions · BibTeX · CC BY 4.0
Abstract: This paper investigates the effectiveness of training-free structured pruning techniques for Large Language Models (LLMs), with a particular focus on depth and width pruning strategies. Through an extensive empirical evaluation across a diverse range of tasks, datasets, and modalities, we reveal critical limitations in current pruning methods. While some tasks exhibit minimal performance degradation, others deteriorate significantly even at low pruning rates, contradicting prior findings that often rely on selective benchmarks. Our analysis also finds that depth pruning, despite its simplicity, usually outperforms the more granular width pruning approaches in maintaining downstream task performance. Our findings highlight that existing evaluations of pruned LLMs often overstate their effectiveness due to incomplete or limited evaluation tasks, necessitating a critical reassessment of the true value of pruning and emphasizing the need for more robust pruning algorithms.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=ypLpXQavDK
Changes Since Last Submission:
- Added Section 4.11 (Unstructured Pruning)
- Updated Sections 6.5 and 3.2 (Comparison with LLMPruner)
- Updated Section 4.3 (Added relevant citation)
Assigned Action Editor: ~Aaron_Klein1
Submission Number: 3974