Keywords: Model Pruning
TL;DR: Rather than prescribing how to score and prune parameters, this paper supplies a universal adaptive threshold that can be paired with any existing pruning criterion and is supported by a theoretically guaranteed lower bound on the retained mass.
Abstract: We introduce Effective Model Pruning (EMP), a context-agnostic, parameter-free rule addressing a fundamental question about pruning: how many entries to keep. EMP does not prescribe how to score parameters or how to prune models; instead, it supplies a universal adaptive threshold that can be applied to any pruning criterion: weight magnitude, attention score, KAN importance score, or even feature-level signals such as image pixels, and it can be used on structural parts or individual weights of a model. Given any score vector $s$, EMP maps $s$ to a built-in effective number $N_{eff}$, inspired by the Inverse Simpson index of contributors. Retaining the $N_{eff}$ highest-scoring entries and zeroing the remainder yields sparse models with performance comparable to the original dense networks across MLPs, CNNs, Transformers/LLMs, and KANs in our experiments. By leveraging the geometry of the simplex, we derive a tight lower bound on the preserved mass $s_{eff}$ (the sum of retained scores) over the ordered probability simplex associated with the score vector $s$. We further verify the effectiveness of $N_{eff}$ by pruning models with a scaled threshold $\beta N_{eff}$ across a variety of criteria and models. Experiments suggest that the default $\beta = 1$ yields a robust threshold for model pruning, while $\beta \neq 1$ still serves as an optional adjustment to meet specific sparsity requirements.
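The following is a minimal sketch of the rule described in the abstract, under the assumption that $N_{eff}$ follows the standard Inverse Simpson form $(\sum_i s_i)^2 / \sum_i s_i^2$ computed on normalized non-negative scores; the function names (`effective_number`, `emp_prune`) and the clipping of $\beta N_{eff}$ to $[1, n]$ are illustrative choices, not details taken from the paper.

```python
import numpy as np

def effective_number(scores):
    """Effective number of contributors via the Inverse Simpson index.

    Assumed form: normalize the non-negative scores to a probability
    vector p and return 1 / sum(p_i^2), rounded to an integer.
    """
    s = np.asarray(scores, dtype=float).ravel()
    p = s / s.sum()
    return int(round(1.0 / np.sum(p ** 2)))

def emp_prune(scores, beta=1.0):
    """Keep the top (beta * N_eff) entries of `scores`, zero the rest.

    Returns a binary mask with the same shape as `scores`.
    """
    s = np.asarray(scores, dtype=float).ravel()
    k = int(round(beta * effective_number(s)))
    k = max(1, min(len(s), k))               # clip the threshold to [1, n]
    mask = np.zeros_like(s)
    mask[np.argsort(s)[-k:]] = 1.0           # retain the k highest-scoring entries
    return mask.reshape(np.shape(scores))

# Example: magnitude pruning of a weight matrix with the default beta = 1
W = np.random.randn(64, 64)
mask = emp_prune(np.abs(W))
W_sparse = W * mask
```

The same mask construction can in principle be driven by any score vector (attention scores, KAN importance scores, or feature-level signals), since only the ranking and the normalized score distribution enter the computation.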
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 3218