Momentum-Driven Adaptivity: Towards Tuning-Free Asynchronous Federated Learning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Asynchronous federated learning (AFL) has emerged as a promising solution to address system heterogeneity and improve the training efficiency of federated learning. However, existing AFL methods face two critical limitations: 1) they rely on strong assumptions about bounded data heterogeneity across clients, and 2) they require meticulous tuning of learning rates based on unknown system parameters. In this paper, we tackle these challenges by leveraging momentum-based optimization and adaptive learning strategies. We first propose MasFL, a novel momentum-driven AFL framework that successfully eliminates the need for data heterogeneity bounds by effectively utilizing historical descent directions across clients and iterations. By mitigating the staleness accumulation caused by asynchronous updates, we prove that MasFL achieves state-of-the-art convergence rates with linear speedup in both the number of participating clients and local updates. Building on this foundation, we further introduce AdaMasFL, an adaptive variant that incorporates gradient normalization into local updates. Remarkably, this integration removes all dependencies on problem-specific parameters, yielding a fully tuning-free AFL approach while retaining theoretical guarantees. Extensive experiments demonstrate that AdaMasFL consistently outperforms state-of-the-art AFL methods in runtime efficiency and exhibits exceptional robustness across diverse learning rate configurations and system conditions.
Lay Summary: Asynchronous Federated Learning (AFL) is a promising approach to address system heterogeneity and improve training efficiency in federated learning. However, existing AFL methods face two major challenges: they rely on strong assumptions about bounded data heterogeneity across clients and require tedious tuning of learning rates based on unknown system parameters. To address these issues, we propose MasFL, a momentum-driven AFL framework that eliminates the need for heterogeneity bounds by leveraging historical descent directions across clients and iterations. By mitigating staleness accumulation caused by asynchronous updates, MasFL achieves state-of-the-art convergence rates with linear speedup in the number of clients and local updates. Building on this framework, we introduce AdaMasFL, an adaptive variant that incorporates gradient normalization into local updates. This fully tuning-free approach removes all dependencies on problem-specific parameters while retaining strong theoretical guarantees. Extensive experiments demonstrate that AdaMasFL consistently outperforms state-of-the-art AFL methods in runtime efficiency and offers exceptional robustness across diverse learning rate configurations and system conditions.
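To make the two ideas in the abstract concrete, here is a minimal, illustrative sketch (not the authors' algorithm or released code) of how momentum over historical descent directions and normalized local updates might fit together in an asynchronous setting. All names and hyperparameters (local_update, server_step, beta, gamma, eta) are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: momentum over past descent directions at the server,
# plus norm-normalized local steps so the local step size is scale-free.
import numpy as np

def local_update(x, grad_fn, num_local_steps=5, gamma=0.1):
    """Run normalized local steps starting from a (possibly stale) model copy x."""
    x = x.copy()
    for _ in range(num_local_steps):
        g = grad_fn(x)
        # Normalizing the gradient removes sensitivity to its magnitude,
        # which is the intuition behind a tuning-free local step size.
        x -= gamma * g / (np.linalg.norm(g) + 1e-12)
    return x

def server_step(x_server, delta, momentum, beta=0.9, eta=1.0):
    """Fold a (possibly delayed) client update into a momentum buffer, then apply it."""
    momentum = beta * momentum + (1.0 - beta) * delta  # history of descent directions
    return x_server + eta * momentum, momentum

# Toy usage on per-client quadratic objectives (stand-in for heterogeneous data).
rng = np.random.default_rng(0)
x_server, momentum = np.zeros(10), np.zeros(10)
for t in range(100):
    stale_model = x_server.copy()              # an asynchronous client may hold an older model
    target = rng.normal(size=10)               # each client has its own local objective
    grad_fn = lambda x: x - target
    delta = local_update(stale_model, grad_fn) - stale_model
    x_server, momentum = server_step(x_server, delta, momentum)
```

In this toy setup the momentum buffer smooths over delayed client deltas, which is the sketch-level analogue of the staleness mitigation described above; the actual MasFL/AdaMasFL updates and their guarantees are given in the paper itself.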
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Optimization
Keywords: asynchronous federated learning, arbitrary data heterogeneity, tuning-free, parameter-free, momentum, gradient normalization
Submission Number: 4415