On Federated Learning of Deep Networks from Non-IID Data: Parameter Divergence and the Effects of Hyperparametric Methods

Heejae Kim; Taewoo Kim; Chan-Hyun Youn

On Federated Learning of Deep Networks from Non-IID Data: Parameter Divergence and the Effects of Hyperparametric Methods

Heejae Kim, Taewoo Kim, Chan-Hyun Youn

25 Sept 2019 (modified: 05 May 2023)ICLR 2020 Conference Blind SubmissionReaders: Everyone

TL;DR: We investigate the internal reasons of our observations, the diminishing effects of the well-known hyperparameter optimization methods on federated learning from decentralized non-IID data.

Abstract: Federated learning, where a global model is trained by iterative parameter averaging of locally-computed updates, is a promising approach for distributed training of deep networks; it provides high communication-efficiency and privacy-preservability, which allows to fit well into decentralized data environments, e.g., mobile-cloud ecosystems. However, despite the advantages, the federated learning-based methods still have a challenge in dealing with non-IID training data of local devices (i.e., learners). In this regard, we study the effects of a variety of hyperparametric conditions under the non-IID environments, to answer important concerns in practical implementations: (i) We first investigate parameter divergence of local updates to explain performance degradation from non-IID data. The origin of the parameter divergence is also found both empirically and theoretically. (ii) We then revisit the effects of optimizers, network depth/width, and regularization techniques; our observations show that the well-known advantages of the hyperparameter optimization strategies could rather yield diminishing returns with non-IID data. (iii) We finally provide the reasons of the failure cases in a categorized way, mainly based on metrics of the parameter divergence.

Code: https://github.com/fl-noniid/fl-noniid

Keywords: Federated learning, Iterative parameter averaging, Deep networks, Decentralized non-IID data, Hyperparameter optimization methods

Original Pdf: pdf

20 Replies

Loading