Abstract: We introduce TQCompressor, a neural network compression method based on enhanced tensor decompositions. We propose a permutation-based improvement to Kronecker decomposition that reduces the loss in model expressivity typically associated with compression. Applied to $\mathbf{GPT-2}_{small}$, the method yields the TQCompressedGPT-2 model with 81 million parameters, down from 124 million. Further enhanced through multi-step knowledge distillation on 3.1% of OpenWebText, TQCompressedGPT-2 outperforms DistilGPT-2 and KnGPT-2. We have made TQCompressedGPT-2 publicly available.
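To make the core idea concrete: Kronecker compression approximates a weight matrix $W$ by a product $A \otimes B$ with far fewer parameters, and a well-chosen permutation of $W$'s rows or columns can lower the approximation error before factoring. The sketch below is only a generic illustration under assumed details, not the paper's algorithm: it pairs the classical Van Loan-Pitsianis nearest-Kronecker-product computation with a naive random search over row permutations; the function names, shapes, and search strategy are all hypothetical.

```python
import numpy as np

def nearest_kronecker(W, m1, n1, m2, n2):
    """Van Loan-Pitsianis: find A (m1 x n1) and B (m2 x n2)
    minimizing ||W - kron(A, B)||_F for W of shape (m1*m2, n1*n2)."""
    # Rearrange W so each (m2 x n2) block becomes one row; the optimal
    # Kronecker factors then come from the rank-1 SVD truncation of R.
    R = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    B = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    return A, B

def best_of_random_row_perms(W, m1, n1, m2, n2, trials=100, seed=0):
    """Illustrative stand-in for a permutation search (NOT the paper's
    method): try random row permutations, keep the lowest residual."""
    rng = np.random.default_rng(seed)
    best_perm, best_err = np.arange(W.shape[0]), np.inf
    for _ in range(trials):
        perm = rng.permutation(W.shape[0])
        A, B = nearest_kronecker(W[perm], m1, n1, m2, n2)
        err = np.linalg.norm(W[perm] - np.kron(A, B))
        if err < best_err:
            best_perm, best_err = perm, err
    return best_perm, best_err

# Toy usage: a 12 x 12 matrix factored as (3 x 4) kron (4 x 3).
W = np.random.default_rng(1).standard_normal((12, 12))
identity_err = np.linalg.norm(W - np.kron(*nearest_kronecker(W, 3, 4, 4, 3)))
_, permuted_err = best_of_random_row_perms(W, 3, 4, 4, 3)
print(f"no permutation: {identity_err:.3f}, best random perm: {permuted_err:.3f}")
```

In this toy setup the factors store $3 \cdot 4 + 4 \cdot 3 = 24$ values instead of $144$, which is the parameter saving the abstract refers to; the permutation only changes which entries each factor must jointly explain, so it costs no extra parameters.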