Keywords: Large Language Model, Pruning, Lottery Tickets, GPT, Fine-Tuning
TL;DR: Iteratively fine-tuning SparseGPT-pruned models, using relatively few training steps, significantly improves their performance at high sparsity.
Abstract: Massive language models with billions of parameters incur significant compute costs and can therefore benefit from pruning. Pruning techniques for such models are typically iterative and require extensive weight retraining after pruning. SparseGPT, a recently introduced one-shot technique for pruning these models, avoids this retraining. We improve upon SparseGPT by fine-tuning during pruning with minimal training steps, and we compare against magnitude pruning, finding that our iteratively fine-tuned SparseGPT models significantly outperform their magnitude-pruned counterparts at high sparsity.
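To make the prune-then-briefly-fine-tune idea concrete, below is a minimal sketch of one way such an iterative loop could be structured. This is not the authors' implementation: the function names (`iterative_prune_and_finetune`, `prune_layer_placeholder`), the sparsity schedule, and the step counts are illustrative assumptions, and SparseGPT's Hessian-aware one-shot solver is replaced here by a simple magnitude-based mask so the sketch stays self-contained.

```python
import torch
import torch.nn as nn


def prune_layer_placeholder(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a binary mask keeping the largest-magnitude weights.

    Placeholder for SparseGPT's one-shot solver; plain magnitude pruning
    is used here only to keep the sketch runnable.
    """
    k = int(weight.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()


def iterative_prune_and_finetune(model, dataloader, loss_fn,
                                 sparsity_schedule=(0.25, 0.5, 0.75),
                                 finetune_steps=100, lr=1e-5):
    """Alternate pruning with a small number of fine-tuning steps."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for target_sparsity in sparsity_schedule:
        # 1) Prune every linear layer to the current target sparsity.
        masks = {}
        for name, module in model.named_modules():
            if isinstance(module, nn.Linear):
                mask = prune_layer_placeholder(module.weight.data, target_sparsity)
                module.weight.data *= mask
                masks[name] = mask

        # 2) Fine-tune briefly, re-applying the masks after each step
        #    so pruned weights stay at zero.
        model.train()
        data_iter = iter(dataloader)
        for _ in range(finetune_steps):
            try:
                inputs, targets = next(data_iter)
            except StopIteration:
                data_iter = iter(dataloader)
                inputs, targets = next(data_iter)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
            for name, module in model.named_modules():
                if isinstance(module, nn.Linear):
                    module.weight.data *= masks[name]
    return model
```

In this structure, swapping the placeholder mask function for a SparseGPT-style pruner and tuning the schedule and step count would recover the kind of iterative prune-and-fine-tune procedure the abstract describes.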