Keywords: PEFT, LLMs, fine-tuning, fusion frames, frame theory, finite frame theory
TL;DR: We propose FrameFT, a new parameter-efficient fine-tuning algorithm that uses fusion frame theory for sparse weight updates
Abstract: Fine-tuning large-scale pre-trained models for downstream tasks remains a challenge, particularly as model sizes continue to grow. While Parameter-Efficient Fine-Tuning (PEFT) strategies such as Low-Rank Adaptation (LoRA) have emerged as effective solutions,
their memory requirements scale linearly with the hidden dimension of the model, $\mathcal{O}(dr)$, where $d$ is the hidden dimension and $r$ is the rank.
In this work, we present FrameFT, a novel PEFT method based on Fusion Frames. We model the parameter update $\Delta W$ with a sparse coefficient matrix in the Fusion Frame representation space.
Crucially, the Fusion Frames can be generated algorithmically and shared across model layers, enabling highly efficient updates.
Hence, only the sparse coefficients of the frame expansion are stored and optimized, dramatically reducing the memory footprint and trainable parameter count.
The sparse structure of the coefficient matrix in FrameFT, together with the sparsity in the Fusion Frames themselves, provides computational benefits compared to other fine-tuning methods.
Our theoretical analysis establishes formal convergence results for FrameFT.
We evaluate our method across a suite of supervised fine-tuning benchmarks, primarily focusing on language tasks, while also reporting results on vision models. Our empirical evaluation demonstrates that FrameFT achieves performance on par with or exceeding that of state-of-the-art PEFT techniques, while requiring far fewer trainable parameters and less memory.
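To make the parameterization described in the abstract concrete, here is a minimal illustrative sketch, assuming a PyTorch setting: the weight update is expressed as a sparse combination of fixed, layer-shared frame elements, and only the active coefficients are trained. The class name `FrameUpdate`, the random frame construction, and the random choice of sparse support are hypothetical stand-ins and are not taken from the paper.

```python
import torch
import torch.nn as nn

class FrameUpdate(nn.Module):
    """Delta_W = sum_k c_k * F_k, with the frame bank F fixed/shared and c sparse and trainable."""
    def __init__(self, frames: torch.Tensor, num_active: int):
        super().__init__()
        # frames: (K, d_out, d_in) fixed frame elements, generated once and shared
        # across layers, hence stored as a non-trainable buffer.
        self.register_buffer("frames", frames)
        # Sparse support: only `num_active` coefficients are trainable (chosen at
        # random here purely for illustration).
        self.active_idx = torch.randperm(frames.shape[0])[:num_active]
        self.coeffs = nn.Parameter(torch.zeros(num_active))

    def delta_w(self) -> torch.Tensor:
        # Weighted sum over the active frame elements only.
        return torch.einsum("k,kij->ij", self.coeffs, self.frames[self.active_idx])

    def forward(self, x: torch.Tensor, frozen_w: torch.Tensor) -> torch.Tensor:
        # y = x (W + Delta_W)^T, with the pre-trained weight W kept frozen.
        return x @ (frozen_w + self.delta_w()).T

# One shared frame bank serves every adapted layer; each layer stores only its
# sparse coefficients, so trainable parameters are O(num_active) per layer.
d_out, d_in, num_frames = 64, 64, 256
shared_frames = torch.randn(num_frames, d_out, d_in) / (d_out * d_in) ** 0.5
layer_update = FrameUpdate(shared_frames, num_active=16)

x = torch.randn(8, d_in)
frozen_w = torch.randn(d_out, d_in)
y = layer_update(x, frozen_w)
print(y.shape)                                            # torch.Size([8, 64])
print(sum(p.numel() for p in layer_update.parameters()))  # 16 trainable scalars
```

In this reading, the frame bank is generated once, frozen, and reused across layers, so the per-layer training state reduces to a handful of coefficients; the abstract's actual fusion frame construction and sparsity pattern are defined in the paper itself.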
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 13035