TL;DR: An approach that generates fine-tuning weights directly from pretrained weights while preserving compute and memory efficiency
Abstract: Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced *wee-gift*), a novel approach that **learns to generate fine-tuning weights directly from the pretrained weights**. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multi-faceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.
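To make the low-rank formulation concrete, below is a minimal PyTorch sketch of the idea as the abstract describes it: the fine-tuning residual is generated from the frozen pretrained weight by two linear layers, which can be learned per layer or shared across layers. The class name `WeGeFTLinear`, the zero initialization of the second map, and the default rank are illustrative assumptions, not the reference implementation; see the repository linked below for the official code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeGeFTLinear(nn.Module):
    """Minimal sketch (illustrative, not the reference implementation) of a
    WeGeFT-adapted linear layer: the fine-tuning residual dW is *generated*
    from the frozen pretrained weight W by two low-rank linear maps,
    dW = W @ A @ B."""

    def __init__(self, pretrained: nn.Linear, rank: int = 16,
                 generator: nn.Module | None = None):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen

        if generator is None:
            d_in = pretrained.in_features
            # Two linear layers mapping W (d_out x d_in) row-wise to a
            # residual dW of the same shape; only 2 * d_in * rank
            # parameters are learned for this layer.
            generator = nn.Sequential(
                nn.Linear(d_in, rank, bias=False),  # "A": d_in -> rank
                nn.Linear(rank, d_in, bias=False),  # "B": rank -> d_in
            )
            # Assumption: zero-init the second map so dW = 0 at the start,
            # i.e. fine-tuning begins exactly at the pretrained model.
            nn.init.zeros_(generator[1].weight)
        # Passing the same `generator` module to several layers shares its
        # parameters across them; constructing it here learns it per layer.
        self.generator = generator

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_w = self.generator(self.pretrained.weight)  # dW = W @ A @ B
        return self.pretrained(x) + F.linear(x, delta_w)
```

In this sketch `delta_w` depends only on the frozen weight, so once training ends it could be merged into `pretrained.weight`, leaving inference cost unchanged; sharing one generator across layers of the same width is what lets the added parameter count drop below a per-layer scheme like LoRA.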
Lay Summary: Modern AI language models are extremely capable, but adapting them to new tasks can be resource-heavy, requiring lots of memory, computing power, and changes to many of the model's internal parameters. To make this easier, researchers have developed techniques that update only a small number of these parameters, making the fine-tuning process more efficient.
One popular method, called LoRA (Low-Rank Adaptation), strikes a strong balance: it keeps the number of new parameters low and remains efficient in terms of memory, speed, and accuracy. However, many newer methods that reduce the number of added parameters even further do so at the cost of more memory, more computation, or lower accuracy.
We created WeGeFT (short for Weight-Generative Fine-Tuning, pronounced wee-gift), a new approach that keeps LoRA's broad efficiency benefits while reducing the number of added parameters even more. It learns how to generate the necessary weight updates directly from the original model's knowledge, using a simple and compact design. Despite being lightweight, WeGeFT matches or outperforms LoRA on a wide range of tasks, from arithmetic and commonsense reasoning to instruction following, coding, and image recognition, making it a powerful and efficient tool for tuning AI models.
Link To Code: https://github.com/savadikarc/wegeft
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: Parameter-Efficient Fine-Tuning
Submission Number: 8250