CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Large Language Models

02 Feb 2026 (modified: 11 Mar 2026), Withdrawn by Authors, CC BY 4.0

Abstract: Large Language Models (LLMs) demonstrate exceptional performance across various tasks but demand substantial computational resources even for fine-tuning. Although Low-Rank Adaptation (LoRA) significantly reduces memory consumption during fine-tuning, it does little to reduce computational cost. This paper identifies the computation of activation gradients as the primary bottleneck in LoRA's backward propagation and introduces the Computation-Efficient LoRA (CE-LoRA) algorithm, which improves computational efficiency while preserving memory efficiency. CE-LoRA relies on two key techniques: Approximated Matrix Multiplication, which replaces dense multiplications of large, full matrices with sparse multiplications involving only critical rows and columns, and Double-LoRA, which reduces error propagation in activation gradients. Theoretically, CE-LoRA converges at the same rate as LoRA, $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of iterations. Empirical evaluations confirm that CE-LoRA significantly reduces computational cost compared to LoRA without notable performance degradation.
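
To make the idea of multiplying "only critical rows and columns" concrete, the following is a minimal NumPy sketch of a generic top-k approximate matrix product. It is an illustration under stated assumptions, not the CE-LoRA implementation: the function name `approx_matmul_topk`, the parameter `k`, and the norm-product selection score are all hypothetical choices made here, since the abstract does not specify how critical rows and columns are chosen.

```python
import numpy as np


def approx_matmul_topk(A, B, k):
    """Approximate A @ B using only k of the n inner-dimension indices.

    A has shape (m, n) and B has shape (n, p). The exact product is the sum
    of n outer products A[:, i] * B[i, :]. This sketch keeps the k terms with
    the largest score ||A[:, i]|| * ||B[i, :]|| and drops the rest.
    The scoring rule is an assumption for illustration only.
    """
    # One score per inner index: column norm of A times row norm of B.
    scores = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    # Indices of the k highest-scoring column/row pairs.
    keep = np.argsort(scores)[::-1][:k]
    # Smaller dense product over the selected columns of A and rows of B.
    return A[:, keep] @ B[keep, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((64, 256))
    B = rng.standard_normal((256, 32))
    exact = A @ B
    for k in (32, 128, 256):
        approx = approx_matmul_topk(A, B, k)
        rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
        print(f"k={k}: relative error {rel_err:.3f}")
```

With `k` equal to the full inner dimension the result matches the exact product; with smaller `k` the multiplication cost falls roughly in proportion to `k / n`, at the price of an approximation error that depends on how concentrated the scores are.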

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Bamdev_Mishra1

Submission Number: 7292
