LoLoRA: Locally Fine-Tuned Low Rank Adapters

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: efficient fine-tuning, language models, localized learning
Abstract: Low-Rank Adaptation (LoRA) is a memory-efficient fine-tuning method for large language models (LLMs) that approximates weight updates as $\Delta W = BA$, where $B \in \mathbb{R}^{n \times r}$, $A \in \mathbb{R}^{r \times m}$, and $r \ll \min(m, n)$. To maximize memory savings, one can freeze the matrix $A$, which avoids storing its input activations, but this often degrades performance. In this work, we mitigate this trade-off by introducing gradient-free updates to $A$ during the forward pass. Our method computes these updates from the layer's immediate input, allowing $A$ to adapt to shifts in the input distribution without storing activations for the backward pass. This approach maintains performance comparable to standard LoRA while further reducing the memory required for fine-tuning.
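The memory argument in the abstract can be illustrated with a minimal sketch. It assumes a single linear layer $h = Wx + BAx$ with $W$ and $A$ frozen, and a framework that caches only what the backward pass needs. It does not implement the paper's gradient-free update to $A$, which the abstract does not specify. All function names below are hypothetical and chosen for illustration only.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def lora_param_counts(n, m, r):
    """Return (parameters of a full n x m update, parameters of B and A)."""
    return n * m, r * (n + m)


def saved_floats_per_token(m, r, train_A):
    """Floats the adapter must cache per token for its backward pass.

    grad_B = g (A x)^T needs only z = A x   (r floats).
    grad_A = (B^T g) x^T also needs x       (m more floats).
    """
    return r + (m if train_A else 0)


def forward_frozen_A(W, A, B, x):
    """Compute h = W x + B (A x); return h and the only cached value, z = A x."""
    z = matvec(A, x)
    h = [w + d for w, d in zip(matvec(W, x), matvec(B, z))]
    return h, z


def grad_B(g, z):
    """Gradient of the loss w.r.t. B, given g = dL/dh and the cached z."""
    return outer(g, z)


if __name__ == "__main__":
    n, m, r = 4096, 4096, 8  # illustrative sizes, not taken from the paper
    print(lora_param_counts(n, m, r))                   # (16777216, 65536)
    print(saved_floats_per_token(m, r, train_A=True))   # 4104
    print(saved_floats_per_token(m, r, train_A=False))  # 8
```

With $A$ frozen, the cached value per token shrinks from the $m$-dimensional input $x$ plus the $r$-dimensional projection $Ax$ to the projection alone, which is the saving the abstract refers to.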
Supplementary Material: pdf
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 23929