LoLoRA: Locally Fine-Tuned Low Rank Adapters

Dmitry Metelev; Evgenii Aleksandrovich Dzhivelikian; Petr Kuderov; Aleksander Karpov; Nikita Iltiakov; Daniil Drozdov; Aleksandr Panov; Igor Salnikov; Alexander Vladimirovich Demidovskij

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Readers: Everyone | License: CC BY 4.0

Keywords: efficient fine-tuning, language models, localized learning

Abstract: Low-Rank Adaptation (LoRA) is a memory-efficient fine-tuning method for large language models (LLMs) that approximates weight updates as $\Delta W = BA$, where $B \in \mathbb{R}^{n \times r}$, $A \in \mathbb{R}^{r \times m}$ and $r \ll \min(m, n)$. To maximize memory savings, one can freeze matrix $A$, avoiding the storage of its input activations, but this often degrades performance. In this work, we mitigate this trade-off by introducing gradient-free updates to matrix $A$ during the forward pass. Our method computes these updates based on the layer's immediate input, allowing it to adapt to input distribution shifts without storing activations for the backward pass. This approach maintains performance comparable to standard LoRA while further reducing the memory required for fine-tuning.
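The memory argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not attempt the gradient-free update of $A$, which the abstract does not specify. It only shows, for a single input vector, which quantities each LoRA gradient depends on: the gradient for $B$ needs only the $r$-dimensional projection $Ax$, whereas the gradient for $A$ needs the full $m$-dimensional input $x$. The helper functions, shapes, and numeric values are hypothetical and chosen for illustration.

```python
# Sketch: which activations each LoRA gradient needs (plain Python, no deps).
# Shapes follow the abstract: B is n x r, A is r x m. All values are made up.

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]

def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]

n, m, r = 3, 4, 2
A = [[1.0, 0.0, 2.0, -1.0],
     [0.5, 1.0, 0.0, 3.0]]   # r x m
B = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 0.5]]             # n x r
x = [1.0, 2.0, 3.0, 4.0]     # layer input, length m
g = [1.0, -1.0, 2.0]         # upstream gradient dL/dy, length n

z = matvec(A, x)             # projected input A x, length r
delta_y = matvec(B, z)       # adapter output (B A) x, length n

# dL/dB = g z^T: only the r-dimensional z has to be kept for the backward pass.
grad_B = outer(g, z)

# dL/dA = (B^T g) x^T: the full m-dimensional x has to be kept.
grad_A = outer(matvec(transpose(B), g), x)

print("stored per input if A is frozen:", len(z))   # r values
print("stored per input if A is trained:", len(x))  # m values
```

Under these assumptions, freezing $A$ reduces what must be retained per input from $m$ values to $r$ values, which is the saving the abstract refers to when it mentions avoiding the storage of input activations.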

Supplementary Material: pdf

Primary Area: transfer learning, meta learning, and lifelong learning

Submission Number: 23929
