Multi-Objective One-Shot Pruning for Large Language Models

Weiyu Chen; Hansi Yang; Yunhao GOU; Han Shi; En-Liang Hu; Zhenguo Li; James Kwok

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, NeurIPS 2025 poster, CC BY 4.0

Keywords: Multi-Objective Optimization, Pruning, Large Language Models

Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks but require substantial computational resources, limiting their deployment in resource-constrained environments. While one-shot pruning methods can reduce model size without expensive retraining, they typically optimize for a single objective, ignoring LLMs' multi-faceted applications. We introduce Multi-Objective One-Shot Pruning (MOSP), which formulates LLM pruning as a multi-objective optimization problem. MOSP efficiently generates a Pareto set of pruned models representing different capability trade-offs, allowing users to select solutions aligned with their preferences. The proposed approach identifies a shared core support while enabling specialized support. Experiments across various LLMs and sparsity levels demonstrate MOSP's superior performance in navigating multi-objective trade-offs compared to baseline methods.

Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)

Submission Number: 15825
