Federated Learning with Heterogeneous Differential Privacy

TMLR Paper756 Authors

06 Jan 2023 (modified: 28 Feb 2023) · Withdrawn by Authors
Abstract: Federated learning (FL) takes a first step towards privacy-preserving machine learning by training models while keeping client data local. However, models trained using FL may still indirectly leak private client information through model updates during training. Differential privacy (DP) may be employed on model updates to provide privacy guarantees within FL, typically at the cost of degraded performance of the final trained model. Both non-private FL and DP-FL can be solved using variants of the federated averaging (\textsc{FedAvg}) algorithm. In this work, we consider a heterogeneous DP setup where clients require varying degrees of privacy guarantees. First, we analyze the optimal solution to the federated linear regression problem with \emph{heterogeneous} DP in a Bayesian setup. Unlike the non-private setup, where the optimal solution for homogeneous data amounts to a single global model for all clients learned through \textsc{FedAvg}, we find that the optimal solution for each client in this setup is a personalized one, even for homogeneous data. We also analyze the privacy-utility trade-off for this problem, characterizing the gain obtained from heterogeneous privacy when some clients opt for less stringent privacy guarantees. We propose a new algorithm for FL with heterogeneous DP, referred to as \textsc{FedHDP}, which employs personalization and weighted averaging at the server, based on the privacy choices of clients, to achieve better performance on clients' local models. Through numerical experiments, we show that \textsc{FedHDP} provides up to a $9.27\%$ performance gain compared to the baseline DP-FL on the considered datasets when $5\%$ of clients opt out of DP. Additionally, we show a gap of up to $3.49\%$ in the average performance of local models between non-private and private clients, empirically illustrating that the baseline DP-FL might incur a large utility cost when not all clients require strict privacy guarantees.
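To make the server-side aggregation idea concrete, below is a minimal sketch, assuming Python/NumPy, of weighted averaging over two client groups split by privacy choice. The function name `fedhdp_aggregate` and the use of `r` as a convex-combination weight are illustrative assumptions for this sketch, not the paper's exact update rule (which also involves client-side personalization).

```python
import numpy as np

def fedhdp_aggregate(private_updates, public_updates, r):
    """Sketch of server-side weighted averaging with heterogeneous DP.

    private_updates: list of model updates from clients with DP
                     (DP noise is assumed to be added client-side).
    public_updates:  list of updates from clients who opted out of DP.
    r:               hypothetical weighting hyperparameter in [0, 1]
                     placing weight r on the private-group average.
    """
    # Average each group separately.
    private_avg = np.mean(private_updates, axis=0)
    public_avg = np.mean(public_updates, axis=0)
    # Weighted combination: noiseless opted-out updates can receive
    # more weight by choosing a smaller r.
    return r * private_avg + (1.0 - r) * public_avg

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 4
    # Simulated updates: private clients' updates carry extra DP noise.
    private_updates = [rng.normal(0.0, 1.0, dim) for _ in range(19)]
    public_updates = [rng.normal(0.0, 0.1, dim) for _ in range(1)]  # 5% opted out
    print(fedhdp_aggregate(private_updates, public_updates, r=0.7))
```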
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=mANbinHZuw
Changes Since Last Submission: Dear Prof. Kangwook Lee,

First, we would like to thank you for giving us the opportunity to revise our paper, and to thank you and the reviewers again for the constructive feedback we received during the last round. We hope the revised paper addresses the main issues that were identified in the last round. We have updated the manuscript to reflect the suggested revisions. In particular, we have revised the entirety of the paper with a focus on keeping the flow of ideas seamless, as well as clarifying certain statements and assumptions. The following summarizes the revisions made according to our previous response:

1. **Writing:** We have revised the paper heavily to address the concerns around writing quality. The latest version of the paper is now less than 12 pages and has considerably better readability and consistency. Specifically, we revised the introduction to be more concise and clear, especially the discussion of related works and their relationship to our work. Furthermore, Section 2 was revised to ensure ideas are clarified in a systematic manner for the reader. Additionally, we heavily reworked Section 3, replacing the federated point estimation with the federated linear regression setup. We made sure to include only the important details and the insightful results in this part; details not necessary for the reader to understand the conceptual ideas behind the algorithm in this setup were moved to the appendix. Moreover, additional revisions were made in the remainder of the paper, as well as the appendices.
2. **Hyperparameter $r$:** As promised, we have clarified the assumption made regarding the hyperparameter tuning of $r$. We include statements in the remark in Section 5 and in the limitations appendix to reflect this assumption.
3. **DP algorithms utilizing public data as baselines:** In the introduction, we have cited the papers in this line of work and stated the limitation they face in FL setups.

We sincerely thank you and the reviewers again for your time handling our paper.

Thanks,
Authors
Assigned Action Editor: ~Kangwook_Lee1
Submission Number: 756