Abstract: Compositional optimization (CO) has recently gained popularity due to its many applications in machine learning. The large-scale and distributed nature of data necessitates efficient federated learning (FL) algorithms for CO, but the compositional structure of the objective poses significant challenges. Current methods either rely on large-batch gradients (which are impractical), require expensive computations, or suffer from suboptimal guarantees. To address these challenges, we propose efficient FedAvg-type algorithms for solving non-convex CO in the FL setting. We first establish theoretically that standard FedAvg fails to solve federated CO problems because data heterogeneity amplifies the bias in local gradient estimates. Our analysis shows that controlling this bias necessarily requires either {\em additional communication} or {\em additional structural assumptions}. To this end, we develop two algorithms for solving the federated CO problem. First, we propose FedDRO, which exploits the compositional problem structure to design a communication strategy that allows FedAvg to converge. FedDRO achieves a sample complexity of $\mathcal{O}(\epsilon^{-2})$ and a communication complexity of $\mathcal{O}(\epsilon^{-3/2})$ when the inner compositional objective is low-dimensional. When the inner objective is high-dimensional, the communication complexity increases to $\mathcal{O}(\epsilon^{-2})$, while the sample complexity remains $\mathcal{O}(\epsilon^{-2})$. Second, we propose DS-FedDRO, a two-sided learning rate algorithm that leverages an additional assumption to improve upon the communication complexity of FedDRO. DS-FedDRO achieves the optimal $\mathcal{O}(\epsilon^{-2})$ sample complexity and $\mathcal{O}(\epsilon^{-1})$ communication complexity irrespective of the dimensionality of the inner compositional objective. We corroborate our theoretical findings with empirical studies on large-scale CO problems.
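The claim that heterogeneity biases local gradient estimates can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a hypothetical scalar objective $f(\frac{1}{K}\sum_k g_k(x))$ with $f(y) = y^2$ and linear client maps $g_k(x) = a_k x$, and the function names are chosen here for illustration only. It compares the gradient of the global composition with the average of the gradients each client would compute from its own $g_k$ alone.

```python
def global_grad(x, slopes):
    """Gradient of f(mean_k g_k(x)) with f(y) = y**2 and g_k(x) = a_k * x."""
    a_bar = sum(slopes) / len(slopes)
    # d/dx (a_bar * x)**2 = 2 * a_bar**2 * x
    return 2.0 * a_bar * a_bar * x


def avg_local_grad(x, slopes):
    """Average of client-side gradients of f(g_k(x)), each using only its own g_k."""
    # d/dx (a_k * x)**2 = 2 * a_k**2 * x
    return sum(2.0 * a * a * x for a in slopes) / len(slopes)


def bias(x, slopes):
    """Gap between the averaged local gradients and the true global gradient."""
    return avg_local_grad(x, slopes) - global_grad(x, slopes)


if __name__ == "__main__":
    print(bias(1.0, [2.0, 2.0]))  # identical clients
    print(bias(1.0, [1.0, 3.0]))  # heterogeneous clients
```

In this toy setting the gap equals $2x$ times the variance of the slopes $a_k$, so it vanishes only when all clients share the same inner map. This is merely a deterministic illustration of why naive local averaging can be biased under heterogeneity; it does not reproduce the paper's analysis or its algorithms.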
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We thank the Action Editor for handling our submission and for the careful consideration of our manuscript. The revised version makes two main improvements over the previous one:
1. We have updated the figures to ensure uniform formatting throughout the manuscript.
2. We have expanded the discussion of the heterogeneity assumptions and their justification.
Supplementary Material: zip
Assigned Action Editor: ~Sebastian_U_Stich1
Submission Number: 6993