GMFL: Efficient Global Masking for Federated LLM Fine-tuning

ACL ARR 2026 January Submission 3252 Authors

04 Jan 2026 (modified: 20 Mar 2026)
License: CC BY 4.0
Keywords: federated learning, large language model, fine-tuning, communication efficiency
Abstract: Low-Rank Adaptation (LoRA) has emerged as a prominent solution to mitigate the communication and computation costs in federated fine-tuning of Large Language Models (LLMs). However, we observe that even within low-rank adapters, a substantial portion of parameters manifest negligible updates during federated training, leading to redundant communication and wasted local computation. To address this, we propose \textbf{GMFL}, a \textbf{plug-and-play} layer freezing mechanism designed to \textbf{seamlessly integrate} with existing federated fine-tuning frameworks. Specifically, the server monitors the global update magnitude of each LoRA layer to dynamically generate freezing masks. These masks are updated periodically with a fixed freezing rate, ensuring stable convergence by robustly identifying ``saturated'' layers. Theoretical analysis confirms the convergence of GMFL, where the freezing mechanism yields a bounded error that scales with client heterogeneity. Extensive experiments across multiple tasks (GLUE, Commonsense Reasoning, Math Reasoning and General Generation) demonstrate that GMFL reduces communication overhead and lowers computational costs while preserving the performance of the underlying federated fine-tuning methods. Our work provides a practical, versatile solution for deploying large-scale federated LLM fine-tuning in resource-constrained environments.
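The abstract describes the server generating freezing masks by ranking LoRA layers by global update magnitude and freezing a fixed fraction of them. The paper's exact criterion and aggregation are not given here; the following is a minimal hypothetical sketch of that idea, where `update_magnitudes` stands in for the server-side per-layer aggregate update norms and `freeze_rate` is the fixed fraction of layers to freeze.

```python
def make_freezing_mask(update_magnitudes, freeze_rate):
    """Hypothetical sketch of GMFL-style mask generation (not the
    authors' implementation): freeze the layers whose global update
    magnitudes are smallest, i.e. the layers treated as "saturated"."""
    n_freeze = int(len(update_magnitudes) * freeze_rate)
    # Indices of the n_freeze layers with the smallest update magnitudes.
    frozen = sorted(range(len(update_magnitudes)),
                    key=lambda i: update_magnitudes[i])[:n_freeze]
    # True = layer stays trainable this period; False = layer is frozen,
    # so its LoRA parameters are neither updated locally nor communicated.
    return [i not in frozen for i in range(len(update_magnitudes))]

# Example: with a 50% freezing rate, the two layers with the smallest
# aggregate updates (indices 1 and 3) are frozen.
mask = make_freezing_mask([0.9, 0.05, 0.4, 0.01], freeze_rate=0.5)
print(mask)  # [True, False, True, False]
```

Because the mask is recomputed periodically, a layer frozen in one period can be unfrozen later if its update magnitude recovers, which is consistent with the abstract's emphasis on stable convergence.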
Paper Type: Long
Research Area: Low-resource Methods for NLP
Research Area Keywords: parameter-efficient-training, NLP in resource-constrained settings
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Submission Number: 3252