Keywords: model compression, model merging, transformers, parameter efficiency, redundancy
TL;DR: We propose a novel and effective Transformer compression method that selects, aligns, and merges feed-forward layers.
Abstract: With the rise and ubiquity of ever-larger deep learning models, the need for high-quality compression techniques has been growing in order to deploy these models widely. The sheer parameter count of some models makes it difficult to fit them within the memory constraints of different hardware. In this work, we present a novel approach to model compression that merges similar parameter groups within a model, rather than pruning away less important parameters. Specifically, we propose a straightforward method for selecting, aligning, and merging separate feed-forward sublayers in Transformer models, and test our method on language modeling, image classification, and machine translation. With our method, we demonstrate performance comparable to the original models across our three diverse tasks while merging more than a third of each model's feed-forward sublayers. For instance, we can remove over 21% of the total parameters from a Vision Transformer while maintaining 99% of its original performance. Additionally, we observe that pairs of feed-forward sublayers often exhibit regions of high similarity between their activations, which may help explain their surprising mergeability.
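To make the select-align-merge idea concrete, here is a minimal sketch (not the authors' exact procedure) of aligning two feed-forward sublayers by permuting the hidden units of one to match the other before averaging their weights. The cosine-similarity matching criterion, the Hungarian assignment, and the tensor shapes are all illustrative assumptions.

```python
# Hypothetical sketch: align two Transformer FFN sublayers by permuting hidden
# units, then merge them by simple parameter averaging.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_and_merge_ffn(ffn_a, ffn_b):
    """Each ffn_* is (W_in, b_in, W_out): W_in is (d_ff, d_model), W_out is (d_model, d_ff)."""
    W_in_a, b_in_a, W_out_a = ffn_a
    W_in_b, b_in_b, W_out_b = ffn_b

    # Score how well each hidden unit in A matches each hidden unit in B,
    # here via cosine similarity of their input weight vectors (an assumption).
    norm_a = W_in_a / np.linalg.norm(W_in_a, axis=1, keepdims=True)
    norm_b = W_in_b / np.linalg.norm(W_in_b, axis=1, keepdims=True)
    sim = norm_a @ norm_b.T                      # (d_ff, d_ff)

    # Hungarian matching yields a permutation of B's hidden units that
    # maximizes total similarity to A's hidden units.
    _, perm = linear_sum_assignment(-sim)

    # Apply the permutation to B's parameters, then average with A.
    W_in_b, b_in_b, W_out_b = W_in_b[perm], b_in_b[perm], W_out_b[:, perm]
    return (
        0.5 * (W_in_a + W_in_b),
        0.5 * (b_in_a + b_in_b),
        0.5 * (W_out_a + W_out_b),
    )  # a single merged sublayer that can be shared in place of the two originals
```

In a compressed model, the merged sublayer would be tied (shared) across the positions of the original sublayers, which is what yields the parameter savings described in the abstract.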
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11946