Private and Personalized Histogram Estimation in a Federated Setting

Published: 28 Oct 2023, Last Modified: 22 Nov 2023FL@FM-NeurIPS’23 PosterEveryoneRevisionsBibTeX
Student Author Indication: Yes
Keywords: Federated learning, personalization, histogram estimation, privacy
Abstract: Personalized federated learning (PFL) aims at learning personalized models for users in a federated setup. We focus on the problem of privately estimating histograms (in the KL metric) for each user in the network. Conventionally, for more general problems learning a global model jointly via federated averaging, and then finetuning locally for each user has been a winning strategy. But this can be suboptimal if the user distribution observes diverse subpopulations, as one might expect with user vocabularies. To tackle this, we study an alternative PFL technique: clustering based personalization that first identifies diverse subpopulations when present, enabling users to collaborate more closely with others from the same subpopulation. We motivate our algorithm via a stylized generative process: mixture of Dirichlets, and propose initialization/pre-processing techniques that reduce the iteration complexity of clustering. This enables the application of privacy mechanisms at each step of our iterative procedure, making the algorithm user-level differentially private without severe drop in utility due to added noise. Finally, we present empirical results on Reddit users data where we compare our method with other well-known PFL approaches applied to private histogram estimation.
Submission Number: 58