Keywords: Large Model Pruning, Model Compression, Compensation
Abstract: The increasing prevalence of large-scale models in both the vision and language domains presents significant challenges in memory and resource consumption. Model pruning is an effective way to compress models and alleviate these constraints, but existing techniques either require extensive, resource-intensive fine-tuning or perform well only at low sparsity levels (10%-50%) and fail at high sparsity levels (50%-90%). To address these issues, this paper introduces LAMP, which avoids the high resource consumption of fine-tuning-based pruning methods while mitigating the performance degradation that fine-tuning-free methods suffer at high sparsity. Experimental results demonstrate that LAMP performs slightly better than SparseGPT at low sparsity levels and significantly better at high sparsity levels, on both language and vision models, without a significant increase in memory consumption compared to SparseGPT.
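As a point of reference for the sparsity levels discussed in the abstract, the sketch below shows generic unstructured magnitude pruning of a single weight tensor to a target sparsity. This is only an illustration of what "pruning at a given sparsity level" means; it is not the LAMP method described in the paper, and the function name and shapes are hypothetical.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so roughly `sparsity`
    fraction of the weights become zero (illustrative baseline, not LAMP)."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    # k-th smallest absolute value serves as the pruning threshold
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold
    return weight * mask

# Example: prune a hypothetical 4096x4096 layer to 90% sparsity
# (the high-sparsity regime referenced in the abstract)
w = torch.randn(4096, 4096)
w_pruned = magnitude_prune(w, 0.9)
print((w_pruned == 0).float().mean())  # ~0.9
```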
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9380