Depth as a Scaling Vector: Simple Pruning and Evaluation of Emergent Abilities in Pruned LLMs

Published: 24 Sept 2025, Last Modified: 24 Sept 2025
NeurIPS 2025 LLM Evaluation Workshop Poster
License: CC BY 4.0
Keywords: depth pruning, emergent ability, structured pruning, large language models, LLM deployment
TL;DR: Simple depth pruning methods and a systematic evaluation of emergent abilities in pruned LLMs.
Abstract: The evolving lifecycle of large language models (LLMs) calls for effective strategies to scale them down for deployment without sacrificing core capabilities. In this work, we investigate depth as a primary architectural scaling vector: we introduce simple methods for pruning layers of LLMs and systematically evaluate how such scaling affects their emergent abilities. Our evaluations demonstrate that these methods offer a practical path toward LLM deployment, significantly reducing computational demands while retaining the emergent abilities that make these models powerful and attractive across a wide range of applications.
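The abstract does not specify which layers are removed or by what criterion. As an illustrative sketch only, the following shows the general idea of depth pruning: dropping a contiguous block of decoder layers from a pretrained model. The function name `prune_depth`, the chosen layer range, and the assumption of a Llama-style Hugging Face architecture (with `model.model.layers` and `config.num_hidden_layers`) are hypothetical and not taken from the paper.

```python
# Minimal sketch of depth pruning for a Llama-style Hugging Face model.
# Assumptions (not from the paper): the model exposes its decoder stack as
# model.model.layers (an nn.ModuleList) and records depth in
# config.num_hidden_layers, as Llama-family models in `transformers` do.
import torch
from transformers import AutoModelForCausalLM


def prune_depth(model, start: int, end: int):
    """Remove decoder layers with indices in [start, end) and reindex the rest."""
    kept = torch.nn.ModuleList(
        layer
        for i, layer in enumerate(model.model.layers)
        if not (start <= i < end)
    )
    # Reassign per-layer indices so KV-cache bookkeeping stays consistent
    # (attribute names assume recent `transformers` Llama-style layers).
    for new_idx, layer in enumerate(kept):
        if hasattr(layer, "self_attn") and hasattr(layer.self_attn, "layer_idx"):
            layer.self_attn.layer_idx = new_idx
    model.model.layers = kept
    model.config.num_hidden_layers = len(kept)
    return model


if __name__ == "__main__":
    # Illustrative usage: drop 8 of 32 layers from a 7B model; the model name
    # and pruned range are placeholders, not the paper's actual setup.
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = prune_depth(model, start=20, end=28)
```

After pruning, one would typically measure perplexity and emergent-ability benchmarks on the reduced model to quantify what depth scaling preserves, which is the kind of systematic evaluation the abstract describes.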
Submission Number: 32