Keywords: calibration data, post-training pruning, large language models
Abstract: As large language models (LLMs) are widely applied across various fields, model
compression has become increasingly crucial for reducing costs and improving
inference efficiency. Post-training pruning is a promising method that does not
require resource-intensive iterative training and only needs a small amount of
calibration data to assess the importance of parameters. Recent research has enhanced post-training pruning from different aspects, but few studies systematically
explore the effects of calibration data, and it remains unclear whether better calibration data construction strategies exist. We fill this gap and, surprisingly, observe that
calibration data is also crucial to post-training pruning, especially at high sparsity. Through controlled experiments on key factors influencing calibration
data, including the pruning settings, the amount of data, and its similarity to
pre-training data, we observe that a small amount of data is adequate, and that data more similar to the pre-training data yields better performance. As pre-training data
is usually inaccessible for advanced LLMs, we further provide a self-generating
calibration data synthesis strategy to construct feasible calibration data. Experimental results on recent strong open-source LLMs (e.g., DCLM and LLaMA-3)
show that the proposed strategy can enhance the performance of strong pruning
methods (e.g., Wanda, DSnoT, OWL) by a large margin (up to 2.68%).
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13874