Unlocking the Potential of Federated Learning for Deeper Models

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Federated Learning, Distributed Model Optimization
TL;DR: We find that the performance of Federated Learning declines substantially when deeper models are used; we investigate the causes of this decline and propose potential solutions.
Abstract: Federated learning (FL) is a new paradigm for distributed machine learning that allows a global model to be trained across multiple clients without compromising their privacy. Although FL has demonstrated remarkable success in various scenarios, recent studies mainly utilize shallow and small neural networks. In our research, we discover a significant performance decline when applying the existing FL framework to deeper neural networks, even when client data are independently and identically distributed. Our further investigation shows that the decline is due to the continuous accumulation of dissimilarities among client models during the layer-by-layer back-propagation process, which we refer to as "divergence accumulation." As deeper models involve a longer chain of divergence accumulation, they tend to exhibit more significant divergence, ultimately leading to performance degradation. Both theoretical derivations and empirical evidence are provided to support the existence of divergence accumulation and its amplified effects in deeper models. To tackle this challenge, we propose a set of technical guidelines centered on minimizing divergence. These guidelines, consisting of strategies such as employing wider models and reducing the receptive field, greatly improve the performance of FL on deeper models. Their effectiveness is validated via extensive evaluation across various metrics. For example, applying the guidelines can boost the performance of ResNet101 on the Tiny-ImageNet dataset by as much as 43%.
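To make the notion of "divergence" in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' code) of how per-layer dissimilarity among client models could be quantified before a FedAvg-style aggregation step. The use of torchvision's ResNet, the `clients` list, and the `layer_divergence` helper are illustrative assumptions only.

```python
# Illustrative sketch, assuming a FedAvg-style setup: measure, for each layer,
# how far client copies of the model have drifted apart after local training.
# Names and models here are stand-ins, not the paper's implementation.
import torch
from torchvision.models import resnet18

def layer_divergence(client_models):
    """Mean deviation of each named parameter across clients, relative to the
    norm of the averaged (global) parameter."""
    names = [n for n, _ in client_models[0].named_parameters()]
    stats = {}
    for name in names:
        params = torch.stack([dict(m.named_parameters())[name].detach()
                              for m in client_models])
        mean = params.mean(dim=0)  # element-wise FedAvg of this parameter
        # Average L2 distance of each client's weights from the global average.
        dev = (params - mean).flatten(1).norm(dim=1).mean()
        stats[name] = (dev / (mean.norm() + 1e-12)).item()
    return stats

# Usage: after each client trains its local copy, compare divergence of early
# vs. late layers (e.g., across ResNet blocks) to see how it varies with depth.
clients = [resnet18(num_classes=200) for _ in range(4)]  # stand-ins for locally trained models
for name, d in list(layer_divergence(clients).items())[:5]:
    print(f"{name}: relative divergence {d:.4f}")
```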
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6994