Activation- and Influence-Aware Ranks (AIR): Function-Preserving SVD Compression for LLMs

Published: 01 Jun 2026, Last Modified: 10 Jun 2026, Venue: AdaptFM Poster, Readers: Everyone, License: CC BY 4.0
Keywords: Machine Learning, LLM Compression, SVD, Low-Rank Approximation
TL;DR: SVD-based LLM compression guided by an element-wise backward-signal influence metric, achieving >18% lower perplexity than the state of the art.
Abstract: We present Activation- and Influence-Aware Ranks (AIR), an SVD-based LLM compression framework that guides each weight matrix's low-rank approximation with a backward-signal influence metric. Starting from the activation-aware optimum of SVD-LLM(W), AIR runs a single closed-form alternating least squares (ALS) sweep that integrates influence element-wise under a monotone-descent guarantee. AIR is layer-local and composes orthogonally with end-to-end methods: alone it matches ACIP, and AIR+LoRA outperforms it. AIR improves perplexity over SVD-LLM(W) by $>$18% at $\leq$60% parameter retention, matches its quality with $\sim$90% less calibration data, and turns parameter savings into FLOP, peak-memory, and per-token-latency gains.
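A minimal sketch of the general idea behind one element-wise weighted ALS sweep, under stated assumptions; it is not the authors' implementation. The weight matrix `M` stands in for an influence metric and is filled with random positive values here. The function names (`truncated_svd_factors`, `als_sweep`, `weighted_loss`) are hypothetical, and a plain truncated SVD replaces the activation-aware starting point of SVD-LLM(W). Each row and column update is an exact weighted least-squares solve, so the weighted reconstruction error cannot increase over the sweep.

```python
import numpy as np

def weighted_loss(W, M, U, V):
    """Element-wise weighted squared error: sum_ij M_ij * (W_ij - (U V^T)_ij)^2."""
    R = W - U @ V.T
    return float(np.sum(M * R * R))

def truncated_svd_factors(W, rank):
    """Rank-`rank` factors U (m x r), V (n x r) with W ~= U @ V.T (unweighted optimum)."""
    P, s, Qt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:rank])
    return P[:, :rank] * root, Qt[:rank, :].T * root

def als_sweep(W, M, U, V):
    """One alternating least squares sweep with element-wise weights M (assumed > 0).

    Step 1 updates every row of U with V fixed; step 2 updates every row of V
    with the new U fixed. Both steps solve their subproblem in closed form.
    """
    U, V = U.copy(), V.copy()
    S = np.sqrt(M)  # weighting by sqrt(M) turns each subproblem into ordinary least squares
    for i in range(W.shape[0]):
        A = S[i][:, None] * V          # (n x r) weighted design matrix for row i
        b = S[i] * W[i]                # weighted targets for row i
        U[i] = np.linalg.lstsq(A, b, rcond=None)[0]
    for j in range(W.shape[1]):
        A = S[:, j][:, None] * U       # (m x r) weighted design matrix for column j
        b = S[:, j] * W[:, j]          # weighted targets for column j
        V[j] = np.linalg.lstsq(A, b, rcond=None)[0]
    return U, V

# Illustrative data only: random weights stand in for an influence metric.
rng = np.random.default_rng(0)
W = rng.standard_normal((12, 10))
M = rng.uniform(0.1, 1.0, size=W.shape)
U0, V0 = truncated_svd_factors(W, rank=3)
U1, V1 = als_sweep(W, M, U0, V0)
loss_before = weighted_loss(W, M, U0, V0)
loss_after = weighted_loss(W, M, U1, V1)
```

Comparing `loss_after` with `loss_before` shows the descent property of the sweep on this toy problem. Storing `U1` and `V1` instead of `W` keeps `rank * (m + n)` parameters rather than `m * n`, which is where the parameter savings of a low-rank factorization come from.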
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 154