Abstract: Fine-tuning large language models (LLMs) on decentralized data offers new opportunities but also poses challenges, particularly with respect to data privacy and communication overhead. Although federated learning (FL) combined with parameter-efficient methods such as low-rank adaptation (LoRA) has shown promise, current approaches often require multiple communication rounds to mitigate client drift, incurring substantial communication and computation overhead. To address these challenges, we propose a novel one-shot parameter-efficient federated tuning (OnePeFT) framework for LLMs that views global model aggregation as heterogeneous knowledge alignment.
In this framework, each client applies LoRA to its local model, trains only the adapters on domain-specific data, and uploads the adapters to the server in a single communication round. The server uses a novel SVD-based aggregation to reparameterize the uploaded adapters at low rank and construct a global initialization. The global adapter is then refined via distillation on a public, task-agnostic dataset, aligning shared semantics across clients to reduce bias and improve both generalization and domain-specific performance. Extensive experiments on LLaMA3-8B and Qwen2-7B show that OnePeFT achieves state-of-the-art performance while reducing communication overhead by up to 20$\times$.
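To illustrate the server-side step, here is a minimal PyTorch sketch of SVD-based aggregation of client LoRA adapters. It assumes the server averages the clients' full-rank updates $\Delta W_i = B_i A_i$ before re-factoring them at rank $r$; the function name, the averaging rule, and the split of singular values between the factors are illustrative assumptions, not necessarily the paper's exact aggregation procedure.

```python
import torch

def svd_aggregate_lora(client_adapters, rank):
    """Aggregate per-client LoRA adapters into one global low-rank adapter.

    client_adapters: list of (B, A) pairs, where each client's update is
    delta_W = B @ A with B of shape (d_out, r) and A of shape (r, d_in).
    Returns a rank-`rank` re-parameterization (B_g, A_g) of the averaged update.
    (Illustrative sketch; the paper's aggregation rule may differ.)
    """
    # Reconstruct each client's full update and average across clients.
    delta = torch.stack([B @ A for B, A in client_adapters]).mean(dim=0)

    # Truncated SVD of the averaged update.
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]

    # Split singular values evenly between the two factors (a common convention).
    B_g = U * S.sqrt()                 # shape (d_out, rank)
    A_g = S.sqrt().unsqueeze(1) * Vh   # shape (rank, d_in)
    return B_g, A_g
```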
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: LLM Efficiency, parameter-efficient-training, distillation
Contribution Types: NLP engineering experiment
Languages Studied: English, Chinese
Keywords: LLM, One-Shot Federated Learning, Parameter-Efficient Fine-Tuning, Distillation
Submission Number: 807