Hierarchical Federated Learning (HFL) advances classic Federated Learning (FL) by introducing a multi-layer architecture between clients and the central server, in which edge servers aggregate models from their respective clients and forward the results to the central server. By avoiding the direct upload of every client update for aggregation, HFL not only reduces communication and computational overhead but also greatly improves scalability, supporting a massive number of clients. When HFL serves applications with a large number of clients, edge servers can train their models in a cyclic pattern (a ring architecture), as opposed to the star architecture in which each edge server develops its own model independently. We refer to this setting as Cyclic HFL (CHFL). Owing to its promising ability to handle data heterogeneity and its resiliency, CHFL has great potential for practical deployment. Unfortunately, a thorough convergence analysis of CHFL remains lacking, especially in the presence of the widespread data heterogeneity among clients. To the best of our knowledge, we are the first to provide a theoretical convergence analysis for CHFL under strongly convex, general convex, and non-convex objectives. Our results show that the convergence rates are $\tilde{\mathcal{O}}(1/MNRKT)$ for strongly convex objectives, $\mathcal{O}(1/\sqrt{MNRKT})$ for general convex objectives, and $\mathcal{O}(1/\sqrt{MNRKT})$ for non-convex objectives, under standard assumptions. Here, $M$ is the number of edge servers, $N$ is the number of clients per edge server, $K$ is the number of local steps per client, $R$ is the number of edge training rounds, and $T$ is the number of global training rounds. Through extensive experiments on real-world datasets, we validate our theoretical findings and further show that CHFL achieves comparable or superior performance when accounting for both inter- and intra-edge data heterogeneity.
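To make the cyclic training pattern concrete, the Python sketch below illustrates one plausible CHFL orchestration implied by the abstract: the global model traverses the $M$ edge servers in ring order, each edge runs $R$ edge rounds in which its $N$ clients take $K$ local gradient steps before an intra-edge average, and the result is handed on to the next edge. The quadratic client objectives, learning rate, and exact hand-off order are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical quadratic local objective per client: f_i(w) = 0.5 * ||A_i w - b_i||^2
def local_steps(w, A, b, K, lr):
    """Run K local gradient steps on one client's objective."""
    for _ in range(K):
        w = w - lr * (A.T @ (A @ w - b))
    return w

def chfl_global_round(w, edges, K, R, lr):
    """One global round of Cyclic HFL: the model visits the M edge servers
    in ring order; each edge runs R edge rounds, each consisting of K local
    steps on its N clients followed by an intra-edge (FedAvg-style) average."""
    for clients in edges:                  # ring traversal over M edges
        for _ in range(R):                 # R edge training rounds
            updates = [local_steps(w, A, b, K, lr) for (A, b) in clients]
            w = np.mean(updates, axis=0)   # intra-edge aggregation
    return w                               # returned to the central server

# Toy setup (illustrative only): M=3 edges, N=4 clients each, dimension d=5
rng = np.random.default_rng(0)
d, M, N = 5, 3, 4
edges = [[(rng.normal(size=(8, d)), rng.normal(size=8)) for _ in range(N)]
         for _ in range(M)]
w = np.zeros(d)
for t in range(20):                        # T global training rounds
    w = chfl_global_round(w, edges, K=5, R=2, lr=0.01)
```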