Keywords: Large Language Model, Pruning, Lottery Tickets, GPT, Fine-Tuning
TL;DR: Iteratively fine-tuning SparseGPT-pruned models, using relatively few training steps, significantly improves their performance at high sparsity.
Abstract: Massive language models with billions of parameters incur significant compute costs and can therefore benefit from pruning. Pruning techniques for such models are typically iterative and require extensive weight retraining after pruning. SparseGPT, a recently introduced one-shot technique for pruning these models, avoids this retraining. We improve upon SparseGPT by fine-tuning during pruning with minimal training steps, and we compare against magnitude pruning, finding that our iteratively fine-tuned SparseGPT models significantly outperform their magnitude-pruned counterparts at high sparsity.
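To make the prune-then-briefly-fine-tune idea concrete, below is a minimal sketch of one way such an iterative loop could be structured. This is not the authors' implementation: the function names (`iterative_prune_and_finetune`, `prune_layer_placeholder`), the sparsity schedule, and the step counts are illustrative assumptions, and SparseGPT's Hessian-aware one-shot solver is replaced here by a simple magnitude-based mask so the sketch stays self-contained.

```python
import torch
import torch.nn as nn


def prune_layer_placeholder(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a binary mask keeping the largest-magnitude weights.

    Placeholder for SparseGPT's one-shot solver; plain magnitude pruning
    is used here only to keep the sketch runnable.
    """
    k = int(weight.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()


def iterative_prune_and_finetune(model, dataloader, loss_fn,
                                 sparsity_schedule=(0.25, 0.5, 0.75),
                                 finetune_steps=100, lr=1e-5):
    """Alternate pruning with a small number of fine-tuning steps."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for target_sparsity in sparsity_schedule:
        # 1) Prune every linear layer to the current target sparsity.
        masks = {}
        for name, module in model.named_modules():
            if isinstance(module, nn.Linear):
                mask = prune_layer_placeholder(module.weight.data, target_sparsity)
                module.weight.data *= mask
                masks[name] = mask

        # 2) Fine-tune briefly, re-applying the masks after each step
        #    so pruned weights stay at zero.
        model.train()
        data_iter = iter(dataloader)
        for _ in range(finetune_steps):
            try:
                inputs, targets = next(data_iter)
            except StopIteration:
                data_iter = iter(dataloader)
                inputs, targets = next(data_iter)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
            for name, module in model.named_modules():
                if isinstance(module, nn.Linear):
                    module.weight.data *= masks[name]
    return model
```

In this structure, swapping the placeholder mask function for a SparseGPT-style pruner and tuning the schedule and step count would recover the kind of iterative prune-and-fine-tune procedure the abstract describes.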