FLAG: Clustered Federated Learning Combining Data and Gradient Information in Heterogeneous Settings

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Federated Learning, Clustering, Distributed Machine Learning
TL;DR: We introduce FLAG, a clustered FL approach that tackles data heterogeneity by using a weighted class-wise similarity metric combining data and gradient similarity for improved client clustering.
Abstract: Federated Learning (FL) has emerged as an important tool that enables a group of agents/clients to collaboratively train a model without sharing their individual data with each other or with any third party; instead, clients exchange only model updates during each training round. Although FL performs effectively when clients' data are homogeneous (e.g., each client's data is i.i.d.), data heterogeneity among clients presents a major challenge, often leading to significant performance degradation. A variety of approaches have been proposed to address this challenge. One particularly effective approach is clustered FL, in which similar clients are grouped together to train separate models. Previous clustered FL approaches tend to rely solely on either data similarity or gradient similarity to cluster clients. This results in an incomplete assessment of client similarities, particularly when the datasets exhibit various types of distributional skew, such as label, feature, or quantity imbalance. Consequently, these methods fail to capture the full spectrum of client heterogeneity, leading to suboptimal model performance across diverse client environments. In this work, we address the challenge of data heterogeneity in FL by introducing a novel clustered FL approach called FLAG. FLAG employs a weighted class-wise similarity metric that integrates both data and gradient similarity, providing a more holistic measure of client similarity. This enables more accurate clustering of clients, ultimately improving model performance across heterogeneous data distributions. Our extensive empirical evaluation on multiple benchmark datasets, under various heterogeneous data scenarios, demonstrates that FLAG consistently outperforms state-of-the-art approaches in terms of accuracy.
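The weighted class-wise similarity described above could be sketched roughly as follows. This is an illustrative assumption, not the paper's actual formulation: the function names, the use of cosine similarity, and the single mixing weight `alpha` are all hypothetical choices standing in for whatever metric FLAG actually defines.

```python
import numpy as np

def cosine(u, v, eps=1e-12):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

def flag_similarity(dist_a, dist_b, grads_a, grads_b, alpha=0.5):
    """Hypothetical weighted class-wise similarity between two clients.

    dist_a, dist_b   : per-class label-frequency vectors, shape (C,)
    grads_a, grads_b : per-class gradient (model-update) vectors, shape (C, D)
    alpha            : assumed scalar weight trading off data vs. gradient
                       similarity (alpha=1 uses data only, alpha=0 gradients only)
    """
    # Data similarity: how alike the two clients' label distributions are.
    data_sim = cosine(dist_a, dist_b)
    # Gradient similarity: average cosine similarity of the class-wise updates.
    grad_sim = np.mean([cosine(ga, gb) for ga, gb in zip(grads_a, grads_b)])
    return alpha * data_sim + (1.0 - alpha) * grad_sim
```

A server could evaluate this score for every client pair to build a similarity matrix and then feed it to any off-the-shelf clustering routine (e.g., agglomerative clustering on `1 - similarity` as a distance), training one model per resulting cluster.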
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10993
