A Convergent Federated Clustering Algorithm without Initial Condition

24 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Federated Learning, Heterogeneity, Clustering
TL;DR: We propose and analyze a theoretically convergent and empirically superior Federated Clustering Algorithm
Abstract: Federated learning (FL) is a distributed learning paradigm that allows multiple users to collaboratively train a shared model without exchanging their data with a central server. However, the optimal models of different users often differ due to the heterogeneity of data across users. In this paper, we address the tension between heterogeneous models and simultaneous training in FL via a clustering structure among the users. The clustering framework allows a high level of heterogeneity across users, while users with similar data can still train a shared model. We define a new clustering framework for FL based on the (optimal) local models of the users: two users belong to the same cluster if their local models are close. We propose an algorithm, Successive Refine Federated Clustering Algorithm (SR-FCA), that initializes every user as a singleton cluster and then successively refines the cluster estimates by exploiting similarity with other users. At every intermediate step, SR-FCA runs an error-tolerant federated learning algorithm within each cluster to exploit simultaneous training and to correct clustering errors. Unlike some prominent prior works, SR-FCA does not require a good initialization (or warm start), either in theory or in practice. We show that with a proper choice of learning rate, SR-FCA incurs an arbitrarily small clustering error. Moreover, unlike some prior works, SR-FCA does not require knowledge of the number of clusters a priori. We also validate the performance of our algorithm on real-world FL datasets, including FEMNIST and Shakespeare, with non-convex problems, and show the benefits of SR-FCA over several baselines.
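The abstract describes SR-FCA only at a high level. The following Python sketch illustrates the singleton-initialization and merge-and-refine pattern it outlines; it is not the authors' implementation. The distance threshold `eps`, the least-squares local objective, and the coordinate-wise trimmed mean standing in for the paper's error-tolerant intra-cluster routine are all illustrative assumptions.

```python
# Minimal sketch of a successive-refinement clustering loop (NOT the paper's code).
# Assumptions: a least-squares local model, a fixed merge threshold `eps`, and a
# coordinate-wise trimmed mean as the error-tolerant aggregator.
import numpy as np

def local_model(X, y, steps=100, lr=0.1):
    """Fit a least-squares model on one user's local data (illustrative)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def trimmed_mean(models, trim=0.1):
    """Coordinate-wise trimmed mean: a simple error-tolerant aggregation rule."""
    M = np.stack(models)
    k = int(trim * len(models))
    M.sort(axis=0)  # sort each coordinate across users
    return M[k:len(models) - k].mean(axis=0) if len(models) > 2 * k else M.mean(axis=0)

def sr_fca_sketch(datasets, eps=0.5, rounds=3):
    """Start from singleton clusters, then repeatedly merge clusters whose
    centers are within `eps` and refit each cluster with a robust aggregate."""
    models = [local_model(X, y) for X, y in datasets]      # per-user local models
    clusters = [[i] for i in range(len(datasets))]         # singleton initialization
    centers = list(models)
    for _ in range(rounds):
        # Merge step: greedily union clusters whose current centers are close.
        merged = []
        for c, w in zip(clusters, centers):
            for j, (c2, w2) in enumerate(merged):
                if np.linalg.norm(w - w2) <= eps:
                    merged[j] = (c2 + c, w2)
                    break
            else:
                merged.append((c, w))
        clusters = [c for c, _ in merged]
        # Refine step: error-tolerant aggregation of the users inside each cluster.
        centers = [trimmed_mean([models[i] for i in c]) for c in clusters]
    return clusters, centers

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = [np.array([1.0, -1.0]), np.array([-1.0, 1.0])]  # two ground-truth clusters
    data = []
    for i in range(6):
        X = rng.normal(size=(50, 2))
        data.append((X, X @ w_true[i % 2] + 0.05 * rng.normal(size=50)))
    print(sr_fca_sketch(data)[0])  # users should split by parity into two clusters
```

In this toy run, users generated from the same ground-truth parameter end up in the same cluster without any warm start or prior knowledge of the number of clusters, mirroring the behavior claimed in the abstract.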
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9076