Keywords: efficient fine-tuning, large language models, subspace tracking, optimization, memory-efficient fine-tuning
Abstract: Training and fine-tuning Large Language Models (LLMs) require substantial computational resources and time due to the size of their weights and optimizer states. To address these challenges and improve accessibility, several memory-efficient techniques have been introduced. For instance, Low-Rank Adaptation (LoRA) optimizes model weights within a low-rank subspace, while Gradient Low-Rank Projection (GaLore) reduces the memory footprint by projecting gradients into a lower-dimensional space. In this paper, we introduce Gradient Subspace Tracking (SubTrack), a method that restricts optimization to a compact core subspace of the gradient matrices and efficiently updates its subspace estimate by leveraging estimation errors and previously identified subspaces. Our results show that even with rank-1 updates to the underlying subspace, SubTrack achieves performance comparable to or better than GaLore, while reducing runtime by an average of 15% and by up to 20.56% on some datasets.
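To make the abstract's core idea concrete, the sketch below illustrates, in a hedged and simplified form, what "restricting optimization to a compact subspace of the gradient and refining that subspace with rank-1 updates driven by the estimation error" can look like. This is not the authors' SubTrack implementation; the class and method names (`SubspaceTracker`, `project`, `update_subspace`) and the specific update rule (a rank-1 additive correction toward the dominant residual direction, followed by re-orthonormalization) are placeholder choices made here for illustration only.

```python
import torch


class SubspaceTracker:
    """Toy tracker for a low-rank subspace of gradient matrices (illustrative only)."""

    def __init__(self, dim: int, rank: int):
        # Orthonormal basis of the tracked subspace, shape (dim, rank).
        self.Q, _ = torch.linalg.qr(torch.randn(dim, rank))

    def project(self, grad: torch.Tensor) -> torch.Tensor:
        # Compact coordinates of the gradient in the tracked subspace: (rank, n).
        return self.Q.T @ grad

    def update_subspace(self, grad: torch.Tensor, step: float = 0.5) -> None:
        coords = self.Q.T @ grad                  # (rank, n) coordinates
        residual = grad - self.Q @ coords         # estimation error: what the subspace misses
        # Leading singular direction of the residual, i.e. the dominant
        # gradient component not captured by the current subspace.
        u, s, vh = torch.linalg.svd(residual, full_matrices=False)
        r_dir = u[:, :1]                          # (dim, 1)
        # Coordinates indicating which basis directions should move toward it.
        w = coords @ vh[:1, :].T                  # (rank, 1)
        w = w / (w.norm() + 1e-12)
        # Rank-1 additive correction, then re-orthonormalize so Q stays a basis.
        self.Q, _ = torch.linalg.qr(self.Q + step * r_dir @ w.T)


# Toy usage: track a rank-4 subspace of 256-dimensional gradients and take a
# step using the compact coordinates projected back into the full space.
dim, rank = 256, 4
tracker = SubspaceTracker(dim, rank)
W = torch.randn(dim, 64, requires_grad=True)      # toy weight matrix
loss = (W ** 2).sum()
loss.backward()
low_rank_grad = tracker.project(W.grad)           # (rank, 64) compact gradient
tracker.update_subspace(W.grad)                   # rank-1 refinement of the subspace
with torch.no_grad():
    W -= 1e-2 * (tracker.Q @ low_rank_grad)       # update mapped back to full space
```

In this simplified picture, the memory savings come from storing optimizer state only for the `(rank, n)` coordinates rather than the full `(dim, n)` gradient, while the cheap rank-1 refinement stands in for the more elaborate error-driven subspace update the paper describes.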
Submission Number: 63