Keywords: PEFT, Quantization, Compression, Finetuning, Foundation Model, LLM training
TL;DR: We propose a PEFT approach with low quantization error.
Abstract: Fine-tuning is essential for adapting large language models to downstream tasks, but it can be costly for users with limited resources. To address this, Sparse Fine-Tuning (SpFT) and Low-Rank Adaptation (LoRA) have been widely adopted for efficient fine-tuning. In this work, we propose a new SpFT framework inspired by neural network pruning: we identify important neurons using structural pruning and fine-tune only the weights associated with them. Experiments on common language tasks show our method improves SpFT’s memory efficiency by 20–50% while matching the accuracy of state-of-the-art methods such as LoRA variants.
Submission Number: 27
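The abstract describes the core recipe at a high level: score neurons with a structural-pruning criterion, then restrict gradient updates to the weights of the selected neurons. Below is a minimal PyTorch sketch of that idea, under assumptions not stated in the abstract: the importance score here is the L2 norm of each output neuron's incoming weights (a common pruning heuristic, used as a stand-in for the paper's actual criterion), and the helper names `select_important_neurons` and `sparse_finetune_step` are hypothetical, for illustration only.

```python
import torch
import torch.nn as nn

def select_important_neurons(linear: nn.Linear, k: int) -> torch.Tensor:
    """Rank output neurons by a structural-pruning-style importance score.

    Assumption: L2 norm of each neuron's incoming weights serves as the
    importance criterion; the paper's actual scoring rule may differ.
    """
    scores = linear.weight.detach().norm(dim=1)  # one score per output neuron
    return torch.topk(scores, k).indices         # indices of the top-k neurons

def sparse_finetune_step(linear: nn.Linear, idx: torch.Tensor,
                         x: torch.Tensor, target: torch.Tensor,
                         lr: float = 1e-3) -> None:
    """One fine-tuning step that updates only the selected neurons' weights."""
    loss = nn.functional.mse_loss(linear(x), target)
    loss.backward()
    with torch.no_grad():
        # Apply the gradient only to rows belonging to important neurons,
        # leaving the remaining pretrained weights frozen.
        linear.weight[idx] -= lr * linear.weight.grad[idx]
        if linear.bias is not None:
            linear.bias[idx] -= lr * linear.bias.grad[idx]
    linear.zero_grad()

# Usage: pick the top 16 neurons of a layer, then take one sparse update step.
layer = nn.Linear(64, 32)
idx = select_important_neurons(layer, k=16)
sparse_finetune_step(layer, idx, torch.randn(8, 64), torch.randn(8, 32))
```

Because only the selected rows (and their optimizer state, if any) are ever updated, the trainable footprint scales with the number of important neurons rather than the full layer, which is consistent with the memory savings the abstract reports.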