Keywords: LLM, Fine-Tuning
Abstract: As a promising memory-efficient technique, zeroth-order (ZO) optimization enables large language models (LLMs) to bypass costly backpropagation during fine-tuning by estimating gradients through function evaluations.
However, to reduce estimation variance in high-dimensional parameter spaces, existing ZO methods estimate gradients within randomly chosen subspaces, overlooking the benefit that more carefully selected subspaces of LLM parameters could bring to gradient estimation.
The inaccurate gradient estimates obtained from such random subspaces inevitably degrade fine-tuning and, in turn, downstream task performance.
To address this limitation of existing ZO methods, this paper proposes a novel ZO subspace fine-tuning method named *SVD-0*. Using singular value decomposition (SVD), SVD-0 obtains more accurate subspace projection matrices, which in turn improve the accuracy of gradient estimates.
Experimental results on various complex language modeling tasks show that SVD-0 achieves better fine-tuning performance and faster convergence than state-of-the-art ZO methods.
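Illustrative sketch (not the paper's algorithm): the abstract does not specify SVD-0's procedure, so the snippet below only sketches the general idea of a two-point (SPSA-style) ZO gradient estimate whose random perturbation is confined to a rank-`r` subspace obtained from an SVD of the weight matrix. The rank `r`, smoothing scale `eps`, per-step subspace refresh, and all function names are assumptions made for illustration.

```python
# Minimal sketch: ZO (two-point) gradient estimation restricted to an
# SVD-derived subspace of a single weight matrix. Assumptions only;
# not the SVD-0 method from the paper.
import numpy as np

def svd_subspace(W, r):
    """Top-r left/right singular subspaces of W."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r], Vt[:r, :]                      # shapes (m, r), (r, n)

def zo_subspace_grad(W, loss_fn, U, Vt, eps=1e-3, rng=None):
    """SPSA-style gradient estimate with the perturbation in span(U) x span(V)."""
    rng = np.random.default_rng() if rng is None else rng
    Z = rng.standard_normal((U.shape[1], Vt.shape[0]))  # low-rank random seed
    P = U @ Z @ Vt                                       # perturbation inside the subspace
    g = (loss_fn(W + eps * P) - loss_fn(W - eps * P)) / (2 * eps)  # directional derivative
    return g * P                                         # gradient estimate along P

# Toy usage: quadratic "loss" pulling W toward a random target.
rng = np.random.default_rng(0)
W_star = rng.standard_normal((8, 6))
W = rng.standard_normal((8, 6))
loss = lambda M: 0.5 * np.sum((M - W_star) ** 2)

for step in range(200):
    U, Vt = svd_subspace(W, r=2)    # refresh the subspace from the current weights
    W -= 0.1 * zo_subspace_grad(W, loss, U, Vt, rng=rng)
print(f"final loss: {loss(W):.4f}")  # loss typically decreases from its initial value
```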
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 6385