Abstract: In this article, we propose Hierarchical Federated Learning with Momentum Acceleration (HierMo), a three-tier worker-edge-cloud federated learning algorithm that applies momentum for training acceleration. Momentum is calculated and aggregated in the three tiers. We provide convergence analysis for HierMo, showing a convergence rate of $\mathcal {O}(\frac{1}{T})$ . In the analysis, we develop a new approach to characterize model aggregation, momentum aggregation, and their interactions. Based on this result, we prove that HierMo achieves a tighter convergence upper bound compared with HierFAVG without momentum. We also propose HierOPT, which optimizes the aggregation periods (worker-edge and edge-cloud aggregation periods) to minimize the loss given a limited training time. By conducting the experiment, we verify that HierMo outperforms existing mainstream benchmarks under a wide range of settings. In addition, HierOPT can achieve a near-optimal performance when we test HierMo under different aggregation periods.
Loading