Keywords: memory-efficient fine-tuning, activation compression, low-rank decomposition
TL;DR: We propose LoRAct, a novel method that significantly reduces activation memory cost while maintaining performance.
Abstract: The parameter-efficient fine-tuning paradigm has garnered significant attention with the advancement of foundation models. Although numerous methods have been proposed to reduce the number of trainable parameters, their substantial memory overhead remains a critical bottleneck that hinders practical deployment. In this paper, we observe that model activations constitute a major source of memory consumption, especially under large batch sizes and long context lengths, yet the rank of these activations remains consistently low. Motivated by this insight, we propose a memory-efficient fine-tuning approach, Low-Rank Activation Compression (LoRAct). Unlike prior work, LoRAct provides a more flexible and versatile compression strategy that can be applied online during the forward pass without any calibration data. Moreover, LoRAct incorporates a novel sampling-based orthogonal decomposition algorithm specifically designed for low-rank matrices, offering improved computational efficiency and a tighter error bound than the widely used RSVD. Experiments on both vision and language tasks demonstrate the effectiveness of LoRAct. Notably, LoRAct reduces activation memory by approximately 80\% compared with the widely adopted LoRA method, while maintaining competitive performance. The source code is available in the supplementary material.
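To make the core idea concrete, below is a minimal PyTorch sketch of low-rank activation compression for a linear layer: the forward pass saves only a rank-r factorization of the input activation (here via a simple randomized range finder, not the paper's sampling-based orthogonal decomposition), and the backward pass reconstructs the activation on the fly to compute the weight gradient. The class name, the fixed `rank` argument, and the 2-D activation assumption are illustrative choices, not the authors' implementation.

```python
import torch

class LowRankLinearFn(torch.autograd.Function):
    """Linear layer that stores a rank-r approximation of the input
    activation for backward, instead of the full activation.
    Illustrative sketch only; assumes a 2-D activation (batch, in_features)."""

    @staticmethod
    def forward(ctx, x, weight, rank):
        y = x @ weight.t()
        # Randomized range finder: sketch the activation columns and
        # orthonormalize (stand-in for the paper's decomposition).
        omega = torch.randn(x.shape[-1], rank, device=x.device, dtype=x.dtype)
        q, _ = torch.linalg.qr(x @ omega)        # (batch, rank) orthonormal basis
        coeff = q.t() @ x                        # (rank, in_features) coefficients
        # Memory saved: (batch + in_features) * rank values vs. batch * in_features.
        ctx.save_for_backward(q, coeff, weight)
        return y

    @staticmethod
    def backward(ctx, grad_y):
        q, coeff, weight = ctx.saved_tensors
        x_approx = q @ coeff                     # reconstruct activation on the fly
        grad_x = grad_y @ weight
        grad_w = grad_y.t() @ x_approx
        return grad_x, grad_w, None


# Usage (hypothetical): y = LowRankLinearFn.apply(x, layer_weight, 32)
```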
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 7380