Keywords: compression, llm
TL;DR: Projected Compression, a method that compresses Transformers using trainable projection modules over frozen base weights.
Abstract: Large language models have steadily increased in size to achieve improved performance; however, this growth has also led to greater inference time and computational demands. Consequently, there is rising interest in model size reduction methods. To address this issue, we propose \textbf{Projected Compression}, a novel model compression technique that reduces the size of model weights by utilizing projection modules. Specifically, we first train additional projection weights while preserving access to all the original model parameters. Subsequently, these projections are combined into a lower-dimensional product matrix, resulting in a reduced-size standard Transformer-based model. Unlike alternative approaches that require additional computational overhead, our method matches the per-token computation cost of training a compressed model. Experimental results show that, compared to other compression methods, Projected Compression performs especially well as the compression rate increases, up to rates as high as 90\%.
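To make the abstract's mechanism concrete, here is a minimal sketch of the projection idea described above: a frozen base weight is flanked by small trainable projections whose product yields a lower-dimensional weight that can later be folded into a standard smaller layer. The class name `ProjectedLinear`, the parameter names `P_in`/`P_out`, the `fold` helper, and the exact placement of the projections are illustrative assumptions, not the paper's precise formulation.

```python
import torch
import torch.nn as nn

class ProjectedLinear(nn.Module):
    """Hypothetical sketch of Projected Compression for one linear layer:
    the original weight W stays frozen, only the projection modules train,
    and their product is a lower-dimensional replacement weight."""

    def __init__(self, base_weight: torch.Tensor, d_in_small: int, d_out_small: int):
        super().__init__()
        d_out, d_in = base_weight.shape
        # Original model weights remain accessible but are never updated.
        self.register_buffer("W", base_weight)
        # Trainable projection modules (assumed to act on both dimensions).
        self.P_in = nn.Parameter(torch.randn(d_in, d_in_small) / d_in ** 0.5)
        self.P_out = nn.Parameter(torch.randn(d_out_small, d_out) / d_out ** 0.5)

    def product_matrix(self) -> torch.Tensor:
        # Lower-dimensional product matrix of shape (d_out_small, d_in_small).
        return self.P_out @ self.W @ self.P_in

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., d_in_small) -> (..., d_out_small)
        return x @ self.product_matrix().t()

    def fold(self) -> nn.Linear:
        # After training, bake the product into a standard smaller Linear,
        # yielding a reduced-size layer with no extra modules at inference.
        W_small = self.product_matrix().detach()
        layer = nn.Linear(W_small.shape[1], W_small.shape[0], bias=False)
        layer.weight.data.copy_(W_small)
        return layer
```

In this reading, the projection product is computed once per layer per forward pass, so the per-token cost during training is close to that of a directly trained compressed layer, consistent with the abstract's claim; how the paper actually instantiates the projections may differ.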
Primary Area: foundation or frontier models, including LLMs
Submission Number: 5914