Abstract: We consider a partially personalized formulation of Federated Learning (FL) that strikes a balance between the flexibility of personalization and the cooperativeness of global training. In our framework, we split the variables into global parameters, which are shared across all clients, and individual local parameters, which are kept private. We prove that under the right split of parameters, it is possible to find global parameters that allow each client to fit their data perfectly, and we refer to the resulting problem as overpersonalized. For instance, the shared global parameters can be used to learn good data representations, whereas the personalized layers are fine-tuned for a specific client. Moreover, we present a simple algorithm for the partially personalized formulation that offers significant benefits to all clients. In particular, it breaks the curse of data heterogeneity in several settings, such as training with local steps, asynchronous training, and Byzantine-robust training.
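As a rough sketch of the split described in the abstract (the notation below is assumed for illustration and is not taken from the paper): with shared global parameters $x$, private local parameters $y_i$ for each client $i = 1, \dots, n$, and client loss $f_i$, the partially personalized objective can be written as

```latex
% Schematic partially personalized FL objective (assumed notation:
% x   = global parameters shared across all clients,
% y_i = local parameters kept private by client i,
% f_i = loss of client i on its own data).
\min_{x} \; \frac{1}{n} \sum_{i=1}^{n} \min_{y_i} f_i(x, y_i)
```

Under this reading, the "overpersonalized" regime of the abstract corresponds to the existence of global parameters for which every client's inner minimum reaches zero, i.e., each client can fit its data perfectly after tuning its own $y_i$.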
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
We have made the following changes:
- We added new experimental results for the impact of data heterogeneity on the convergence of FFGG, and we also added Local Training (personalization only) and FedSim as requested by Reviewer XKnK and Reviewer cUMT.
- We made small changes throughout the text to improve readability. Following up on the comments by Reviewer cUMT, we restructured subsections 1.3-1.5 into a single subsection with a more explicit motivation.
- We added a section with limitations to address the concern of Reviewer XKnK, as well as a section with conclusions following the advice of Reviewer 8Vd9. We also added a detailed comparison to the work of Pillutla et al. as suggested by Reviewer 8Vd9.
Assigned Action Editor: Virginia Smith
Submission Number: 2613