Abstract: In this article, we propose and analyze FastSlowMo, a federated learning (FL) algorithm that combines Nesterov accelerated gradient (NAG) style worker momentum with aggregator momentum. Existing NAG-based FL algorithms rely on either worker momentum alone or aggregator momentum alone, which can be inefficient due to infrequent momentum updates, data heterogeneity, and stale momentum. FastSlowMo combines the advantages of worker momentum and aggregator momentum to address these issues. We then provide a mathematical proof of the convergence of FastSlowMo. Finally, we conduct extensive experiments on real-world datasets with trace-driven simulation, verifying that FastSlowMo decreases the total training time by 3–61% and outperforms existing mainstream benchmarks under a wide range of settings.
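To make the two-level momentum idea concrete, the following is a minimal NumPy sketch of one FL round that pairs worker-side NAG momentum with a generic SlowMo-style momentum on the aggregated model delta. It is an illustration under assumptions, not the paper's exact update rule: the function names (local_nag_steps, federated_round) and hyperparameters (beta_w, beta_s, slow_lr, local_steps) are hypothetical.

```python
import numpy as np

def local_nag_steps(x, grad_fn, lr, beta, num_steps):
    """Worker-side Nesterov-style momentum: the gradient is evaluated
    at the look-ahead point x + beta * v (classic NAG form)."""
    v = np.zeros_like(x)
    for _ in range(num_steps):
        g = grad_fn(x + beta * v)  # look-ahead gradient
        v = beta * v - lr * g
        x = x + v
    return x

def federated_round(x_global, worker_grad_fns, v_server,
                    lr, beta_w, beta_s, slow_lr, local_steps):
    """One round: workers run local NAG steps from the global model,
    then the server applies momentum to the averaged model delta
    (a generic SlowMo-style aggregator update, for illustration only)."""
    local_models = [local_nag_steps(x_global, g, lr, beta_w, local_steps)
                    for g in worker_grad_fns]
    delta = x_global - np.mean(local_models, axis=0)  # pseudo-gradient
    v_server = beta_s * v_server + delta              # aggregator momentum
    x_global = x_global - slow_lr * v_server
    return x_global, v_server

# Toy usage: two workers with heterogeneous quadratic objectives
# that share a common minimizer at the origin.
mats = [np.diag([1.0, 5.0]), np.diag([4.0, 2.0])]
grad_fns = [lambda x, M=M: M @ x for M in mats]
x, v = np.ones(2), np.zeros(2)
for _ in range(50):
    x, v = federated_round(x, grad_fns, v, lr=0.05, beta_w=0.9,
                           beta_s=0.5, slow_lr=1.0, local_steps=5)
print(x)  # approaches [0, 0]
```

In this sketch, worker momentum acts every local step while aggregator momentum acts once per round on the averaged delta, which is the division of labor the abstract describes: frequent fast momentum at the workers and slow momentum that smooths heterogeneous updates at the server.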