Abstract: Large language models have become crucial across various domains, yet their success comes at the cost of considerable computational and memory resources. Model pruning reduces the size of deep learning models by excising redundant elements. However, current pruning methods often fail to deliver substantial end-to-end acceleration. In this paper, we present MI-PRUN, a novel approach that uses mutual information to identify low-impact blocks for efficient model pruning. Furthermore, we incorporate the Data Processing Inequality to ensure that contiguous blocks essential to overall model performance are preserved rather than pruned by accident. We also develop the Fast-Block-Select algorithm to improve the efficiency of the pruning process. Comprehensive experiments show that our method surpasses previous state-of-the-art (SOTA) model pruning methods.
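To make the core idea concrete, below is a minimal sketch, not the authors' implementation, of scoring transformer blocks by an estimate of the mutual information between each block's input and output activations and flagging the lowest-impact blocks as pruning candidates. The histogram-based MI estimator, the mean-pooling of hidden states, the toy random activations, and the reading that high input-output MI indicates a low-impact (near-identity) block are all illustrative assumptions, not details taken from the paper.

```python
# Sketch: rank transformer blocks by an MI-based impact score (assumptions noted above).
import numpy as np

def histogram_mi(x: np.ndarray, y: np.ndarray, bins: int = 32) -> float:
    """Estimate I(X; Y) for two 1-D samples with a simple 2-D histogram (assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def block_scores(inputs: list[np.ndarray], outputs: list[np.ndarray]) -> list[float]:
    """One MI score per block, computed on mean-pooled per-token hidden states."""
    scores = []
    for x, y in zip(inputs, outputs):
        # Pool the hidden dimension so each token contributes one scalar sample.
        scores.append(histogram_mi(x.mean(axis=-1), y.mean(axis=-1)))
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_blocks, n_tokens, d_model = 12, 2048, 64  # toy sizes, not from the paper
    inputs = [rng.normal(size=(n_tokens, d_model)) for _ in range(n_blocks)]
    # Simulate some near-identity blocks: they barely transform their input,
    # so MI(input, output) stays high and they are natural pruning candidates.
    outputs = [x + rng.normal(scale=0.1 if i % 4 == 0 else 1.0, size=x.shape)
               for i, x in enumerate(inputs)]
    scores = block_scores(inputs, outputs)
    prune_candidates = np.argsort(scores)[::-1][:3]  # highest MI ~ least change
    print("per-block MI scores:", np.round(scores, 3))
    print("low-impact block indices:", prune_candidates.tolist())
```

In practice one would collect real hidden states from calibration data rather than random arrays, and the paper's Data Processing Inequality constraint and Fast-Block-Select procedure would further restrict which contiguous blocks may be removed; those components are not modeled in this sketch.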
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: pruning
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Submission Number: 461