Improving Accelerated Federated Learning with Compression and Importance Sampling
Keywords: Machine Learning, Federated Learning, Partial Participation, Gradient Compression
TL;DR: We introduce an accelerated local-training method that supports both client sampling and compression.
Abstract: Federated Learning is a collaborative training framework that leverages heterogeneous data distributed across a vast number of clients. Since it is practically infeasible to query and process every client at each aggregation step, partial participation must be supported. In this setting, communication between the server and the clients is a major bottleneck. There are two main approaches to reducing the communication load: compression and local steps. Recent work by Mishchenko et al. (2022) introduced the ProxSkip method, which achieves an accelerated rate using the local steps technique. Follow-up works successfully combined local-step acceleration with partial participation (Grudzień et al., 2023; Condat et al., 2023) and with gradient compression (Condat et al., 2022). In this paper, we present a complete method for Federated Learning that incorporates all three necessary ingredients: Local Training, Compression, and Partial Participation. Moreover, we analyze a general sampling framework for partial participation and derive an importance sampling scheme that further improves performance. We experimentally demonstrate the advantages of the proposed method in practice.
Submission Number: 70
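
The abstract mentions importance sampling for partial participation without giving details. As a rough illustration of the general idea only, and not the scheme derived in the paper, the sketch below samples a single client with a non-uniform probability and reweights its update so that the server-side estimate of the average update stays unbiased. The function names, the toy updates, and the importance weights are all hypothetical.

```python
import random

def sampling_probabilities(weights):
    """Normalize positive importance weights into sampling probabilities."""
    if any(w <= 0 for w in weights):
        raise ValueError("every client needs a positive weight")
    total = sum(weights)
    return [w / total for w in weights]

def sampled_estimate(updates, probs, rng):
    """Draw one client i with probability probs[i] and return its update
    scaled by 1 / (n * probs[i]), which keeps the estimate unbiased."""
    n = len(updates)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return [u / (n * probs[i]) for u in updates[i]]

def expected_estimate(updates, probs):
    """Exact expectation of sampled_estimate, by enumerating all clients."""
    n = len(updates)
    dim = len(updates[0])
    return [
        sum(probs[i] * updates[i][d] / (n * probs[i]) for i in range(n))
        for d in range(dim)
    ]

# Toy data (hypothetical): three clients, two-dimensional updates.
updates = [[1.0, 0.0], [0.0, 4.0], [3.0, 3.0]]
# Hypothetical importance weights; a real scheme would derive them
# from problem-dependent quantities.
probs = sampling_probabilities([1.0, 4.0, 2.0])

one_round = sampled_estimate(updates, probs, random.Random(0))
average = [sum(col) / len(updates) for col in zip(*updates)]
```

Under these assumptions, `expected_estimate(updates, probs)` matches `average` up to floating-point error for any choice of positive weights. The weights affect only the variance of a single round, and that variance is what an importance sampling scheme aims to reduce.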