Keywords: Federated fine-tuning, low-rank Gram matrix, Procrustes alignment
TL;DR: We propose a federated fine-tuning framework that fine-tunes a single low-rank matrix, aggregates its Gram matrix, and applies Procrustes alignment to the decomposed matrix to improve fine-tuning performance.
Abstract: Parameter-efficient fine-tuning techniques such as Low-Rank Adaptation (LoRA) enable large language models (LLMs) to adapt to downstream tasks efficiently. Federated learning (FL) further facilitates this process by enabling collaborative fine-tuning across distributed clients without sharing private data. However, the use of two separate low-rank matrices in LoRA introduces two challenges for federated fine-tuning. The first is the error induced by aggregating the two low-rank matrices separately. The second persists even when the product of the two matrices is aggregated: the server must recover the factors via matrix decomposition, which is non-unique and can introduce decomposition drift. To tackle these challenges, we propose FLoRG, a federated fine-tuning framework that employs a single low-rank matrix for fine-tuning and aggregates its Gram matrix (i.e., the matrix of inner products of its column vectors), eliminating the aggregation error while also reducing the communication overhead. FLoRG minimizes the decomposition drift through a Procrustes alignment approach that aligns the decomposed matrix between consecutive fine-tuning rounds for consistent updates. We theoretically analyze the convergence of FLoRG and prove that adopting Procrustes alignment yields a tighter convergence bound. Experimental results across multiple LLM fine-tuning benchmarks demonstrate that FLoRG outperforms four state-of-the-art baselines in downstream task accuracy and reduces the communication overhead by up to 82%.
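For readers who want the mechanics, the sketch below illustrates the Gram-matrix aggregation and Procrustes alignment steps as we read them from the abstract. It is a minimal NumPy illustration under our own assumptions, not the paper's implementation: the function names (aggregate_grams, factor_gram, procrustes_align), shapes, and client weights are hypothetical.

```python
import numpy as np

def aggregate_grams(grams, weights):
    """Server step: weighted average of client Gram matrices G_k = A_k^T A_k.
    Each G_k is r x r, so only O(r^2) values are communicated per client,
    versus O(d * r) for the low-rank factor itself."""
    return sum(w * G for w, G in zip(weights, grams))

def factor_gram(G):
    """Recover an r x r factor B with B^T B = G via eigendecomposition.
    Any Q @ B with orthogonal Q is an equally valid factor; this
    non-uniqueness is the decomposition drift the abstract refers to."""
    lam, V = np.linalg.eigh(G)
    lam = np.clip(lam, 0.0, None)        # guard tiny negative eigenvalues
    return np.sqrt(lam)[:, None] * V.T   # B = diag(sqrt(lam)) @ V^T

def procrustes_align(B_new, B_prev):
    """Orthogonal Procrustes: choose the orthogonal R minimizing
    ||R @ B_new - B_prev||_F so consecutive factors stay consistent."""
    U, _, Vt = np.linalg.svd(B_prev @ B_new.T)
    return (U @ Vt) @ B_new

# Hypothetical one-round usage (d = 768, rank r = 8, three equal-weight clients):
rng = np.random.default_rng(0)
client_As = [rng.standard_normal((768, 8)) for _ in range(3)]
G = aggregate_grams([A.T @ A for A in client_As], [1 / 3] * 3)
B_prev = factor_gram(G)                  # factor from the previous round (illustrative)
B = procrustes_align(factor_gram(G), B_prev)
```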
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21402