Keywords: parameter-efficient fine-tuning, multi-task optimization, subset selection, Pareto optimal
Abstract: Parameter-efficient fine-tuning (PEFT) is a highly effective approach for adapting large pre-trained models to downstream tasks with minimal computational overhead. At their core, PEFT methods freeze most parameters and train only a small subset (say $<0.1\%$ of total parameters). Notably, different PEFT methods select different subsets of parameters and achieve varying performance. This variation prompts a key question: how do we adaptively select the most influential subset?
We formulate the subset selection as a multi-task problem: maximizing performance while minimizing the number of trainable parameters, which involves both discrete and continuous objectives. We leverage a series of transformations -- including the $\epsilon$-constraint method and a second-order Taylor approximation -- to arrive at the classical 0-1 knapsack problem, which we solve through the lens of Pareto optimality. Consequently, we propose AdaPEFT, an efficient and scalable algorithm for PEFT that adapts to various tasks, in which our subset selection remains consistent as the training horizon and model size scale up by over $50\times$.
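To make the knapsack formulation concrete, here is a minimal sketch (not part of the submission) of scoring parameter groups and selecting a subset under a trainable-parameter budget. The influence score uses an empirical-Fisher diagonal as a stand-in for the second-order Taylor term, and the greedy ratio rule is a standard 0-1 knapsack heuristic; the function and variable names (`select_peft_subset`, `named_groups`, `budget`) are hypothetical, and the abstract does not specify AdaPEFT's exact score or solver.

```python
import torch

def select_peft_subset(model_loss, named_groups, budget):
    """Sketch: rank parameter groups by an approximate loss-change score,
    then pick groups greedily under a total trainable-parameter budget.

    named_groups: list of (name, parameter tensor) pairs, e.g. per-layer.
    budget: maximum number of trainable parameters allowed.
    """
    params = [p for _, p in named_groups]
    grads = torch.autograd.grad(model_loss, params)

    items = []
    for (name, p), g in zip(named_groups, grads):
        # Second-order proxy: 0.5 * sum(g^2 * p^2), i.e. an empirical-Fisher
        # diagonal in place of the true Hessian (an assumption for this sketch).
        influence = 0.5 * (g.pow(2) * p.pow(2)).sum().item()
        cost = p.numel()
        items.append((name, influence, cost))

    # Greedy 0-1 knapsack by influence-per-parameter ratio: exact for the
    # fractional relaxation, a common heuristic for the integral problem.
    items.sort(key=lambda t: t[1] / t[2], reverse=True)
    chosen, used = [], 0
    for name, influence, cost in items:
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen
```

Groups not returned by the selector would stay frozen; sweeping the budget traces out the performance-versus-parameter-count trade-off curve from which a Pareto-optimal operating point can be read off.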
Primary Area: foundation or frontier models, including LLMs
Submission Number: 1371