Leveraging Function Space Aggregation for Federated Learning at Scale

Published: 13 Feb 2024, Last Modified: 13 Feb 2024, Accepted by TMLR
Authors that are also TMLR Expert Reviewers: ~Zachary_Charles1
Abstract: The federated learning paradigm has motivated the development of methods for aggregating multiple client updates into a global server model, without sharing client data. Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization. In this work, we adopt a function space perspective and propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients, using an estimate based on their Fisher information. We evaluate FedFish on realistic, large-scale cross-device benchmarks. While the performance of FedAvg can suffer as client models drift further apart, we demonstrate that FedFish is more robust to longer local training. Our evaluation across several settings in image and language benchmarks shows that FedFish outperforms FedAvg as local training epochs increase. Further, FedFish results in global networks that are more amenable to efficient personalization via local fine-tuning on the same or shifted data distributions. For instance, federated pretraining on the C4 dataset, followed by few-shot personalization on Stack Overflow, results in a 7% improvement in next-token prediction by FedFish over FedAvg.
Certifications: Expert Certification
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Anastasios_Kyrillidis2
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1827
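
The contrast the abstract draws between direct parameter averaging and Fisher-based aggregation can be illustrated with a small sketch. The abstract does not state FedFish's exact update rule, so the second function below is only an assumption: a per-coordinate average of client updates weighted by a diagonal Fisher information estimate. The names `fedavg`, `fisher_weighted`, `deltas`, `fishers`, and `eps` are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence


def fedavg(deltas: Sequence[Sequence[float]],
           weights: Sequence[float]) -> List[float]:
    """Weighted average of client parameter updates (FedAvg-style)."""
    total = sum(weights)
    dim = len(deltas[0])
    return [
        sum(w * d[j] for w, d in zip(weights, deltas)) / total
        for j in range(dim)
    ]


def fisher_weighted(deltas: Sequence[Sequence[float]],
                    fishers: Sequence[Sequence[float]],
                    eps: float = 1e-8) -> List[float]:
    """Per-coordinate average of client updates, weighted by each client's
    diagonal Fisher estimate. Illustrative assumption only: the abstract
    does not specify the actual FedFish rule."""
    dim = len(deltas[0])
    out = []
    for j in range(dim):
        # Clients with larger Fisher values at coordinate j get more say there.
        num = sum(f[j] * d[j] for f, d in zip(fishers, deltas))
        den = sum(f[j] for f in fishers) + eps  # eps guards against division by zero
        out.append(num / den)
    return out


if __name__ == "__main__":
    # Two hypothetical clients with two parameters each.
    deltas = [[1.0, 0.0], [0.0, 1.0]]
    fishers = [[3.0, 1.0], [1.0, 3.0]]
    print(fedavg(deltas, [1.0, 1.0]))
    print(fisher_weighted(deltas, fishers))
```

In this toy setup each client has a large Fisher value on the coordinate it actually changed, so the Fisher-weighted result stays closer to each client's own update on that coordinate than the plain average does.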