Keywords: language models, interpretability
TL;DR: This paper proposes a technique to enhance the efficiency of language models by utilizing their interpretability.
Abstract: As language models grow in size, they deliver substantial performance improvements across a variety of applications. However, this growth also increases computational demands, making deployment on resource-constrained devices such as personal computers, mobile phones, and wearables more challenging, and significantly raising inference costs on cloud servers. To address these challenges, we introduce a method to streamline language models. We observe that language models pretrained on general datasets often contain redundant components that are unnecessary for particular tasks. Our approach identifies and removes these redundant parts, retaining only the components essential for the intended applications. Specifically, we represent the weight matrices of language models as linear combinations of base components, eliminate the irrelevant bases, and introduce new bases that enhance performance on the target tasks. Evaluations across a range of applications show that our method reduces model size by up to 1.7 times more than state-of-the-art techniques while maintaining comparable accuracy.
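The idea of expressing weight matrices as linear combinations of bases and discarding the irrelevant ones can be illustrated with a minimal sketch. This is not the paper's actual algorithm: it uses a plain SVD and singular-value magnitude as a stand-in for whatever task-specific importance measure the method employs, and the function name `prune_bases` is hypothetical.

```python
import numpy as np

def prune_bases(W: np.ndarray, keep: int) -> np.ndarray:
    """Express W as a sum of rank-1 bases via SVD and keep only the
    top-`keep` bases, dropping the rest as 'irrelevant' for the task.

    Singular-value magnitude here is only a proxy for task relevance;
    a task-aware method would score bases on task data instead.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :keep] * s[:keep]) @ Vt[:keep]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_small = prune_bases(W, keep=16)

# Storage intuition: keeping 16 bases means storing 16*(64 + 64 + 1)
# numbers (U columns, Vt rows, singular values) instead of 64*64 for W,
# roughly a 2x reduction at this rank.
```

The paper's method additionally introduces new bases trained for the target task, which this sketch omits; the pruning step above only shows the "remove redundant components" half of the approach.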
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12375