Momentum Approximation in Asynchronous Private Federated Learning

Tao Yu; Congzheng Song; Jianyu Wang; Mona Chitnis

Published: 01 Oct 2024, Last Modified: 17 Oct 2024

Venue: FL@FM-NeurIPS'24 Oral

License: CC0 1.0

Keywords: Asynchronous Federated Learning, Differential Privacy

Abstract: Asynchronous protocols have been shown to improve the scalability of federated learning (FL) with a massive number of clients. Meanwhile, momentum-based methods can achieve the best model quality in synchronous FL. However, naively applying momentum in asynchronous FL algorithms leads to slower convergence and degraded model performance. It is still unclear how to effectively combine these two techniques to achieve a win-win. In this paper, we find that asynchrony introduces an implicit bias into momentum updates. To address this problem, we propose momentum approximation, which minimizes the bias by finding an optimal weighted average of all historical model updates. Momentum approximation is compatible with secure aggregation as well as differential privacy, and can be easily integrated into production FL systems at a minor communication and storage cost. We empirically demonstrate that, on benchmark FL datasets, momentum approximation can achieve a $1.15 \textrm{--}4\times$ speedup in convergence compared to naively combining asynchronous FL with momentum.
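
The idea of "finding an optimal weighted average of all historical model updates" can be illustrated with a small least-squares sketch. This is a reading of the abstract under stated assumptions, not the paper's implementation: it assumes the server knows a matrix `staleness` whose entry `[t, s]` is the fraction of the update aggregated at step `t` that was computed on model version `s`, and it takes the synchronous momentum weights `beta**(t - s)` as the target. The names `staleness`, `momentum_target`, and `approximate_momentum_weights` are hypothetical.

```python
import numpy as np


def momentum_target(num_steps, beta):
    """Lower-triangular matrix M with M[t, s] = beta**(t - s) for s <= t.

    Row t holds the weights that synchronous momentum would place on the
    gradients of model versions 0..t (assumed target, see lead-in).
    """
    idx = np.arange(num_steps)
    diff = idx[:, None] - idx[None, :]
    return np.where(diff >= 0, beta ** np.clip(diff, 0, None), 0.0)


def approximate_momentum_weights(staleness, beta):
    """Find weights over past aggregated updates that best mimic momentum.

    staleness[t, s] (assumption): fraction of the update aggregated at
    step t that was computed on model version s, with s <= t.
    Returns a lower-triangular A where row t minimizes
    ||A[t, :t+1] @ staleness[:t+1] - M[t]||_2, so step t only uses
    updates that have already been received.
    """
    num_steps = staleness.shape[0]
    target = momentum_target(num_steps, beta)
    weights = np.zeros((num_steps, num_steps))
    for t in range(num_steps):
        past = staleness[: t + 1, :]  # updates available at step t
        # Solve past.T @ a ~= target[t] in the least-squares sense.
        a, *_ = np.linalg.lstsq(past.T, target[t], rcond=None)
        weights[t, : t + 1] = a
    return weights


# Toy example: step 1 is fully stale, step 2 uses version-1 gradients only.
staleness = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])
beta = 0.9
A = approximate_momentum_weights(staleness, beta)
M = momentum_target(3, beta)
naive_bias = np.linalg.norm(M @ staleness - M)   # momentum applied as-is
approx_bias = np.linalg.norm(A @ staleness - M)  # re-weighted history
```

In this sketch `naive_bias` measures how far naively applied momentum drifts from the synchronous target, and `approx_bias` is the residual after re-weighting the history. Because the naive weights are one feasible choice in each row's least-squares problem, the re-weighted residual cannot be larger. When there is no staleness (`staleness` is the identity), the solution reduces to ordinary momentum weights.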

Submission Number: 2
