Adaptive Node Participation for Straggler-Resilient Federated Learning

Amirhossein Reisizadeh, Isidoros Tziotis, Hamed Hassani, Aryan Mokhtari, Ramtin Pedarsani

2022 (modified: 16 Apr 2023)ICASSP 2022Readers: Everyone

Abstract: Federated learning is prone to multiple system challenges including system heterogeneity where clients have different computation and communication capabilities. Such heterogeneity in clients’ computation speeds has a negative effect on the scalability of federated learning algorithms and causes significant slow-down in their runtime due to the existence of stragglers. In this chapter, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients’ data to adaptively select the clients in order to speed up the learning procedure. The key idea of our algorithm is to start the training procedure with faster nodes and gradually involve the slower nodes in the model training once the statistical accuracy of the data corresponding to the current participating nodes is reached. The proposed approach reduces the overall runtime required to achieve the statistical accuracy of data of all nodes, as the solution for each stage is close to the solution of the subsequent stage with more samples and can be used as a warm-start. Our numerical experiments demonstrate significant speedups in wall-clock time of our straggler-resilient method compared to other federated learning benchmarks.

0 Replies