Transformers Compression: A Study of Matrix Decomposition Methods Using Fisher Information

Published: 01 Jan 2023 · Last Modified: 21 May 2025 · AIST 2023 · CC BY-SA 4.0
Abstract: Transformer models have been a breakthrough in Natural Language Processing. However, their performance comes at the cost of enormous model size, which limits options for deployment. To address this issue, we compare compression techniques based on low-rank matrix and tensor factorization for the heavy weight matrices of these models. We focus on Singular Value Decomposition (SVD) and Tensor Train Matrix decomposition (TTM), and extend previous work [10] by incorporating Fisher information into TTM, introducing a novel approach which we call FWTTM.
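As a concrete illustration of how Fisher information can guide a factorization, below is a minimal sketch of Fisher-weighted SVD in the spirit of [10]: rows of the weight matrix are rescaled by the square root of their estimated Fisher information before the SVD, so that directions more important to the task loss are better preserved at a given rank. The function name `fisher_weighted_svd` and its arguments are illustrative assumptions, not the paper's implementation.

```python
import torch

def fisher_weighted_svd(weight, fisher, rank):
    """Fisher-weighted low-rank factorization (illustrative sketch).

    weight: (out, in) matrix to compress.
    fisher: (out,) per-row Fisher estimates, e.g. squared gradients
            accumulated over a calibration set (assumed precomputed).
    rank:   target rank of the factorization.
    """
    # Row scaling D = diag(I^{1/2}), clamped to avoid division by zero.
    s = fisher.clamp_min(1e-8).sqrt()
    u, sigma, vT = torch.linalg.svd(torch.diag(s) @ weight,
                                    full_matrices=False)
    # Truncate to the target rank.
    u, sigma, vT = u[:, :rank], sigma[:rank], vT[:rank]
    # Undo the row scaling on the left factor so that A @ B ~ weight.
    A = torch.diag(1.0 / s) @ u @ torch.diag(sigma)
    B = vT
    return A, B  # rank * (out + in) parameters instead of out * in
```

Replacing a dense layer's weight with the two factors `A` and `B` reduces its parameter count whenever `rank` is below `out * in / (out + in)`; the TTM and FWTTM variants studied in the paper pursue the same goal with a tensor-train factorization instead of a single low-rank product.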