UniSVD: Unilateral Weight Decomposition for Attention-based Vision Models

11 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Model Compression, Singular Value Decomposition, Low-rank Approximation
TL;DR: In this paper, we propose Unilateral Singular Value Decomposition (UniSVD), a novel and effective method that applies decomposition to only one side of each $Q$-$K$ and $V$-$O$ weight pair in a head-wise manner.
Abstract: Transformers have achieved remarkable success across diverse domains, but their ever-growing scale results in prohibitive computational and memory costs. Low-rank matrix decomposition with Singular Value Decomposition (SVD) has emerged as an effective compression technique. Recent studies, such as ASVD, SVD-LLM, and FLAR-SVD, have improved decomposition quality by incorporating activation-aware methods. However, these methods do not consider the unique mechanism of multi-head attention (MHA), where the query-key ($Q$-$K$) and value-output ($V$-$O$) computations are linear and allow pre-computation. To effectively leverage this mechanism, we propose Unilateral Singular Value Decomposition (UniSVD), a novel framework that applies decomposition to only one side of the $Q$-$K$ or $V$-$O$ weight pairs in a head-wise manner. Since the $Q$-$K$-$V$-$O$ weights exhibit varying sensitivities to low-rank approximation across heads and layers, UniSVD adaptively selects which side to decompose according to its rank sensitivity, thereby preserving the important information in the weights. Extensive experiments demonstrate that UniSVD seamlessly integrates with existing decomposition methods and consistently achieves superior trade-offs between parameter reduction, FLOPs, and model performance.
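The core idea can be illustrated with a minimal sketch: because attention scores depend only on the product $W_Q W_K^\top$ per head, one can decompose a single side of the pair and pick the side whose rank-$r$ approximation perturbs that product least. The snippet below is a simplified NumPy illustration under assumed square per-head weights and a Frobenius-norm criterion; the function names and the exact sensitivity measure are hypothetical, not the paper's implementation.

```python
import numpy as np

def truncated_svd(W, r):
    # Rank-r approximation of W via SVD, returned as two low-rank factors.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * s[:r], Vt[:r]

def unilateral_decompose(W_q, W_k, r):
    """Decompose only one of (W_q, W_k) for a single attention head.

    Picks the side whose rank-r approximation changes the bilinear
    Q-K product W_q @ W_k.T the least (a stand-in for the paper's
    rank-sensitivity criterion). Returns the chosen side and factors.
    """
    M = W_q @ W_k.T  # the product that attention scores actually use
    errs = {}
    for name, W in (("q", W_q), ("k", W_k)):
        A, B = truncated_svd(W, r)
        W_hat = A @ B
        M_hat = W_hat @ W_k.T if name == "q" else W_q @ W_hat.T
        errs[name] = np.linalg.norm(M - M_hat)
    side = min(errs, key=errs.get)  # less-sensitive side gets decomposed
    A, B = truncated_svd(W_q if side == "q" else W_k, r)
    return side, A, B
```

The same reasoning applies to the $V$-$O$ pair, since the value and output projections also compose linearly; the untouched side keeps its full-rank weights, which is what distinguishes the unilateral scheme from decomposing both matrices independently.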
Supplementary Material: zip
Primary Area: optimization
Submission Number: 4086