Keywords: Fine-tuning, Parameter-Efficient Fine-tuning, Large Language Models, Foundation Models
TL;DR: A new parameter-efficient fine-tuning method that matches the performance of LoRA and full fine-tuning while training only n+m parameters for an n-by-m weight matrix, offering significant savings in compute and memory.
Abstract: Foundation models achieve strong general performance on a wide variety of tasks, but fine-tuning is often necessary for tasks requiring specialized outputs, constraints, or data. However, fine-tuning the entire model can be computationally prohibitive due to the large number of parameters. In this paper, we introduce HyperAdapt, a parameter-efficient fine-tuning method that significantly reduces the number of trainable parameters compared to state-of-the-art methods like LoRA. Specifically, HyperAdapt fine-tunes a pre-trained weight matrix by applying row-wise and column-wise scaling via diagonal matrices, requiring only $n+m$ trainable parameters for an $n \times m$ matrix. Empirically, we show that HyperAdapt achieves performance comparable to full fine-tuning and existing parameter-efficient methods on widely-used reasoning and arithmetic benchmarks with significantly fewer trainable parameters.
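The abstract describes adapting a frozen weight matrix with row-wise and column-wise diagonal scaling. The paper's exact parameterization and initialization are not given here, so the following is only a minimal NumPy sketch under the assumption that the adapted weight is $W' = \mathrm{diag}(r)\, W\, \mathrm{diag}(c)$, with $r \in \mathbb{R}^n$ and $c \in \mathbb{R}^m$ as the sole trainable parameters (the function name `hyperadapt_forward` is ours, not from the paper):

```python
import numpy as np

def hyperadapt_forward(W, r, c):
    """Assumed HyperAdapt-style adaptation: W' = diag(r) @ W @ diag(c).

    W: frozen pre-trained weight of shape (n, m)
    r: trainable row scales, shape (n,)
    c: trainable column scales, shape (m,)
    Only n + m parameters are trained, vs. n*m for full fine-tuning.
    """
    # Broadcasting avoids materializing the diagonal matrices.
    return (r[:, None] * W) * c[None, :]

n, m = 4, 3
W = np.arange(n * m, dtype=float).reshape(n, m)  # stand-in pre-trained weights
r = np.ones(n)  # identity initialization: adapted model starts equal to the base model
c = np.ones(m)

W_adapted = hyperadapt_forward(W, r, c)
print(np.allclose(W_adapted, W))  # identity scales leave W unchanged
```

With identity initialization the adapted matrix reproduces the frozen weights exactly, so training starts from the pre-trained model and only the $n+m$ scale parameters receive gradients.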
Serve As Reviewer: ~Abel_Gurung1
Submission Number: 57