Keywords: Federated Learning
TL;DR: We achieve parallel computing between the central server and the edge nodes in Federated Learning.
Abstract: Federated Learning (FL) is a cutting-edge distributed machine learning framework that enables multiple devices to collaboratively train a shared model without exposing their own data. Under device heterogeneity, synchronous FL suffers from a latency bottleneck induced by network stragglers, which significantly hampers training efficiency. In addition, because local models differ in structure and size, simple and fast averaging aggregation is no longer feasible. Instead, a more complicated aggregation operation, such as knowledge distillation, is required. The time cost of this aggregation becomes a new bottleneck that limits the computational efficiency of FL.
In this work, we claim that the root cause of training latency lies in the aggregation-then-broadcasting workflow of the server. By swapping the computational order of aggregation and broadcasting, we propose a new parallel federated learning (PFL) framework, which unlocks the edge nodes during global computation and the central server during local computation. This fully asynchronous and parallel pipeline handles device heterogeneity and network stragglers, allows flexible device participation, and scales in computation.
We theoretically prove that PFL achieves a convergence rate similar to that of synchronous FL, and empirically show that our framework tolerates both stragglers and complicated aggregation tasks, yielding a $1.77\times$ to $7.32\times$ speedup.
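To make the workflow argument concrete, the following is a minimal timing sketch, not the authors' implementation. It assumes an idealized model in which a round's wall-clock time depends only on the clients' local training times and the server's aggregation time. The function names (`sequential_round_time`, `overlapped_round_time`, `speedup`) and the example numbers are hypothetical, and communication cost is ignored.

```python
from typing import Sequence


def sequential_round_time(local_times: Sequence[float], agg_time: float) -> float:
    """Aggregation-then-broadcasting: the server waits for the slowest
    client, then every client waits while the server aggregates."""
    return max(local_times) + agg_time


def overlapped_round_time(local_times: Sequence[float], agg_time: float) -> float:
    """Broadcasting-then-aggregation (assumed steady state): local training
    and server aggregation run concurrently, so a round is bounded by
    whichever of the two is slower."""
    return max(max(local_times), agg_time)


def speedup(local_times: Sequence[float], agg_time: float) -> float:
    """Ratio of sequential to overlapped round time under this toy model."""
    return sequential_round_time(local_times, agg_time) / overlapped_round_time(
        local_times, agg_time
    )


if __name__ == "__main__":
    # Hypothetical per-client training times (seconds); the last is a straggler.
    local_times = [1.0, 1.5, 4.0]
    # Hypothetical cost of a heavy aggregation step such as distillation.
    agg_time = 4.0
    print(sequential_round_time(local_times, agg_time))
    print(overlapped_round_time(local_times, agg_time))
    print(speedup(local_times, agg_time))
```

This toy model only captures the overlap between local and global computation. It is not intended to reproduce the speedups reported in the abstract, which are measured empirically.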
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)
Supplementary Material: zip