A Comprehensive Study on Personalized Federated Learning with Non-IID Data

Published: 01 Jan 2022, Last Modified: 13 Nov 2024
Venue: ISPA/BDCloud/SocialCom/SustainCom 2022
License: CC BY-SA 4.0
Abstract: Personalized Federated Learning (PFL) has been extensively studied as a way to overcome the Non-IID challenges of FL. However, under realistic, non-pathological Non-IID distributions, elaborate personalized models can be inferior even to a naive single global model. In this paper, we take a first step toward understanding when PFL is better on a common class of Non-IID distributions: label-skewed distributions. Taking Clustered Federated Learning (CFL) as an example, we demonstrate that it is non-trivial to decide in advance whether PFL will outperform a single global model on a given dataset. To tackle this challenge, we propose Hybrid Clustered Federated Learning (HCFL), which alternates between aggregating and personalizing the cluster models. HCFL obtains the advantages of personalized models while preserving generalization capability. In addition, we propose an efficient clustering algorithm that combines K-Means with the Bayesian Information Criterion to ensure the performance superiority of HCFL. Extensive experiments show that HCFL exhibits superior performance on both slightly and pathologically label-skewed distributions, and outperforms state-of-the-art algorithms by up to 10.6% with negligible additional overhead. Moreover, the performance of HCFL does not rely on well-tuned aggregation hyper-parameters and generalizes well to new clients, which makes it more suitable for deployment in real FL systems.
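The abstract does not spell out how K-Means and the Bayesian Information Criterion are combined, so the following is only a minimal sketch of one common way to do it, not the authors' algorithm. It assumes that each client is summarized by a label histogram (the hypothetical `client_hists`), that K-Means is run for each candidate number of clusters, and that the simplified score `n * ln(RSS / n) + k * d * ln(n)` stands in for the BIC; the helper names `kmeans`, `bic`, and `choose_k` are illustrative.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50):
    """Plain Lloyd's K-Means with a deterministic farthest-point init."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                # Keep the old centroid if a cluster becomes empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    rss = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, rss

def bic(points, rss, k):
    """Simplified BIC (an assumption): fit term plus a k * d * ln(n) penalty."""
    n, d = len(points), len(points[0])
    return n * math.log(rss / n + 1e-12) + k * d * math.log(n)

def choose_k(points, k_max):
    """Return the (k, labels) pair with the lowest BIC among k = 1..k_max."""
    best = None
    for k in range(1, k_max + 1):
        labels, _, rss = kmeans(points, k)
        score = bic(points, rss, k)
        if best is None or score < best[0]:
            best = (score, k, labels)
    return best[1], best[2]

# Hypothetical label histograms for 8 clients over 3 classes:
# four clients skewed toward class 0, four skewed toward class 2.
client_hists = [
    [0.80, 0.10, 0.10], [0.78, 0.12, 0.10],
    [0.82, 0.08, 0.10], [0.80, 0.12, 0.08],
    [0.10, 0.10, 0.80], [0.12, 0.10, 0.78],
    [0.08, 0.10, 0.82], [0.08, 0.12, 0.80],
]
best_k, labels = choose_k(client_hists, k_max=4)
print(best_k, labels)
```

Scoring every candidate `k` with a penalized fit term avoids fixing the number of clusters in advance; when the clients are close to IID, a criterion of this kind can settle on a single cluster, which corresponds to the single global model.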