RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates

Md Kowsher; Tara Esmaeilbeig; Chun-Nam Yu; Mojtaba Soltanalian; Niloofar Yousefi

Published: 10 Oct 2024, Last Modified: 30 Nov 2024

Venue: FITML 2024 Oral

License: CC BY 4.0

Keywords: RoCoFT, Parameter-efficient finetuning, LLMs, Neural Tangent Kernel

TL;DR: A parameter-efficient finetuning method for transformer LMs, covering medium-size LMs like BERT and RoBERTa and larger LMs like Bloom-7B, Llama2-7B, and Llama2-13B.

Abstract: We propose RoCoFT, a parameter-efficient fine-tuning method for large language models that updates only a few rows and columns of the weight matrices in transformers. Through extensive experiments with medium-size LMs like BERT and RoBERTa, and larger LMs like Bloom-7B, Llama2-7B, and Llama2-13B, we show that our method achieves accuracy comparable to or better than state-of-the-art PEFT methods while also being more memory- and compute-efficient. We also study why the method is effective using tools from neural tangent kernel theory: we empirically demonstrate that our kernel, constructed using a restricted set of row and column parameters, is numerically close to the full-parameter kernel and gives comparable classification performance. Ablation studies investigate the impact of different algorithmic choices, including the selection strategy for rows and columns and the optimal rank for effective implementation of our method.
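
The core idea in the abstract, updating only a few rows or columns of a weight matrix, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `trainable_count` and `masked_sgd_step`, the use of plain SGD, and the choice of the first `r` rows or columns are all assumptions made for illustration (the abstract notes that the selection strategy is itself studied in the ablations).

```python
def trainable_count(d_out, d_in, r, mode):
    """Entries updated when only r rows (or r columns) of a d_out x d_in matrix train."""
    if mode == "row":
        return r * d_in      # each selected row has d_in entries
    if mode == "column":
        return r * d_out     # each selected column has d_out entries
    raise ValueError(f"unknown mode: {mode}")


def masked_sgd_step(W, grad, r, mode, lr):
    """Return a copy of W after one SGD step restricted to r rows or r columns.

    Assumption: the first r rows/columns are the selected ones; every other
    entry stays frozen at its pretrained value.
    """
    if mode not in ("row", "column"):
        raise ValueError(f"unknown mode: {mode}")
    new_W = [row[:] for row in W]  # copy so the input matrix is left untouched
    for i, row in enumerate(W):
        for j in range(len(row)):
            selected = (i < r) if mode == "row" else (j < r)
            if selected:
                new_W[i][j] = W[i][j] - lr * grad[i][j]
    return new_W


# Toy 3 x 4 weight matrix with an all-ones gradient.
W = [[0.0] * 4 for _ in range(3)]
grad = [[1.0] * 4 for _ in range(3)]

W_row = masked_sgd_step(W, grad, r=1, mode="row", lr=0.5)
W_col = masked_sgd_step(W, grad, r=1, mode="column", lr=0.5)

print(trainable_count(3, 4, 1, "row"), trainable_count(3, 4, 1, "column"))  # 4 3
```

In this toy setting only 4 (row mode) or 3 (column mode) of the 12 entries change per step, which is the source of the memory savings the abstract refers to: gradients and optimizer state are needed only for the selected slice.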

Submission Number: 96
