Pisces: Efficient Federated Learning via Guided Asynchronous Training

Published: 01 Jan 2022, Last Modified: 11 May 2023. SoCC 2022.
Abstract: Federated learning (FL) is typically performed in a synchronous parallel manner, so the involvement of a slow client delays the training progress. Current FL systems employ a participant selection strategy to select fast clients with quality data in each iteration. However, such clients are not always available in practice, and the selection strategy must navigate a knotty tradeoff between speed and data quality. This paper makes a case for asynchronous FL by presenting Pisces, a new FL system with intelligent participant selection and model aggregation for accelerated training despite slow clients. To avoid incurring excessive resource cost and stale training computation, Pisces uses a novel scoring mechanism to identify suitable clients to participate in each training iteration. It also adapts the aggregation pace dynamically to bound the progress gap between the participating clients and the server, with a provable convergence guarantee in a smooth non-convex setting. We have implemented Pisces in an open-source FL platform, Plato, and evaluated its performance in large-scale experiments with popular vision and language models. Pisces outperforms the state-of-the-art synchronous and asynchronous alternatives, reducing the time-to-accuracy by up to 2.0X and 1.9X, respectively.
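Since the abstract only sketches the scoring and pacing ideas at a high level, the following is a minimal Python sketch of what such a mechanism might look like. The scoring formula, the field names (utility, round_time, staleness), and the max_gap bound are illustrative assumptions for this sketch, not the paper's actual design.

# Illustrative sketch of guided client selection and bounded-staleness
# aggregation pacing in the spirit of the abstract. All names, weights,
# and formulas below are assumptions, not Pisces's published mechanism.

def client_score(statistical_utility, est_round_time, staleness,
                 staleness_penalty=0.5):
    # Favor clients with informative data (high utility), fast rounds
    # (low est_round_time), and fresh base models (low staleness).
    return statistical_utility / est_round_time - staleness_penalty * staleness

def select_clients(clients, k):
    # Pick the k highest-scoring clients for the next training iteration.
    ranked = sorted(clients,
                    key=lambda c: client_score(c["utility"],
                                               c["round_time"],
                                               c["staleness"]),
                    reverse=True)
    return ranked[:k]

def should_aggregate(server_version, in_flight_versions, max_gap=4):
    # Adapt the aggregation pace: proceed only while every in-flight
    # client's base model is within max_gap versions of the server's,
    # bounding the progress gap the abstract refers to.
    return all(server_version - v <= max_gap for v in in_flight_versions)

clients = [
    {"id": 0, "utility": 0.90, "round_time": 4.0, "staleness": 1},
    {"id": 1, "utility": 0.70, "round_time": 1.5, "staleness": 0},
    {"id": 2, "utility": 0.95, "round_time": 9.0, "staleness": 3},
]
print([c["id"] for c in select_clients(clients, k=2)])              # [1, 0]
print(should_aggregate(server_version=10, in_flight_versions=[9, 7]))  # True

The single score ties speed, data quality, and staleness into one ranking, which is one plausible way to resolve the speed-versus-quality tradeoff the abstract describes without a hard cutoff on either axis.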