Track: tiny / short paper (up to 4 pages)
Keywords: Structured Pruning, Parameter Sparsity, Random Pruning, Model Pruning, Model Compression, Large Language Models
TL;DR: Randomly pruning neurons in LLMs works surprisingly well, especially at lower pruning ratios, and can be combined with activation-based pruning for efficient and competitive results.
Abstract: This paper investigates the structured pruning of large language models (LLMs). We find that random pruning, despite its simplicity, is a surprisingly effective baseline, particularly at lower pruning ratios. We further propose a simple and efficient method that combines randomness with existing pruning heuristics. Specifically, our method combines random neuron clustering with activation-magnitude pruning, and achieves performance comparable to gradient-based methods while being significantly more efficient (up to 50x faster). Our code is available at https://github.com/Tim-Siu/llm-random-prune.
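The abstract does not specify how the random clustering and the activation-magnitude criterion interact, so the following is only a minimal sketch of one plausible reading: randomly partition a layer's neurons into clusters, score neurons by mean absolute activation on calibration data, and keep the top-scoring neurons within each cluster. The function name, arguments, and PyTorch implementation details are illustrative assumptions, not the authors' released code.

```python
import torch

def random_cluster_activation_prune(weight, activations, prune_ratio, num_clusters, seed=0):
    """Hypothetical sketch of random clustering + activation-magnitude pruning.

    weight:       (out_features, in_features) weight matrix of a linear layer
    activations:  (num_tokens, out_features) calibration activations of that layer
    prune_ratio:  fraction of output neurons to remove overall
    """
    out_features = weight.shape[0]
    gen = torch.Generator().manual_seed(seed)

    # Random clustering: shuffle neuron indices and split them into groups.
    perm = torch.randperm(out_features, generator=gen)
    clusters = torch.chunk(perm, num_clusters)

    # Activation-magnitude score per neuron (mean absolute activation).
    scores = activations.abs().mean(dim=0)

    keep = []
    for cluster in clusters:
        k = int(round(len(cluster) * (1 - prune_ratio)))
        # Within each random cluster, keep the top-k neurons by activation magnitude.
        top = torch.topk(scores[cluster], k).indices
        keep.append(cluster[top])

    keep = torch.cat(keep).sort().values
    # Return the structurally pruned weight and the surviving neuron indices.
    return weight[keep], keep
```

Setting `num_clusters = 1` would reduce this sketch to pure activation-magnitude pruning, while pruning entire randomly chosen clusters instead would recover purely random structured pruning; the paper's actual formulation may differ.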
Anonymization: This submission has been anonymized for double-blind review by removing identifying information such as names, affiliations, and URLs.
Submission Number: 38