Federated Minimax Optimization with Client Heterogeneity

Published: 12 Dec 2023, Last Modified: 12 Dec 2023. Accepted by TMLR.
Abstract: Minimax optimization has seen a surge in interest with the advent of modern applications such as GANs, and it is inherently more challenging than simple minimization. The difficulty is exacerbated when the training data reside at multiple edge devices, or clients, especially when these clients have heterogeneous datasets and heterogeneous local computation capabilities. We propose a general federated minimax optimization framework that subsumes such settings and several existing methods such as Local SGDA. We show that naive aggregation of model updates from clients running unequal numbers of local steps can result in optimizing a mismatched objective function, a phenomenon previously observed in standard federated minimization. To fix this problem, we propose normalizing the client updates by the number of local steps. We analyze the convergence of the proposed algorithm for classes of nonconvex-concave and nonconvex-nonconcave functions and characterize the impact of heterogeneous client data, partial client participation, and heterogeneous local computations. For all the function classes considered, we significantly improve the existing computation and communication complexity results. Experimental results support our theoretical claims.
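The objective mismatch described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a toy example under several assumptions that do not come from the source, namely scalar variables, a hypothetical client objective f_i(x, y) = 0.5 (x - c_i)^2 + x y - 0.5 y^2, deterministic gradients, and full client participation. The function names `local_gda` and `federated_gda` and the rescaling by the mean number of local steps are illustrative choices. For client centers 0 and 4, the saddle point of the averaged objective is x = y = 1.

```python
def local_gda(x, y, center, num_steps, lr):
    """Run `num_steps` deterministic descent-ascent steps on the toy objective
    f(x, y) = 0.5 * (x - center)**2 + x * y - 0.5 * y**2
    and return the cumulative change in (x, y)."""
    x0, y0 = x, y
    for _ in range(num_steps):
        grad_x = (x - center) + y  # partial derivative of f in x
        grad_y = x - y             # partial derivative of f in y
        # Simultaneous update: descend in x, ascend in y.
        x, y = x - lr * grad_x, y + lr * grad_y
    return x - x0, y - y0


def federated_gda(centers, local_steps, lr=0.01, rounds=500, normalize=True):
    """Average client updates each round (all clients participate).

    With normalize=True, each client's cumulative update is divided by its
    own number of local steps and rescaled by the mean number of steps
    (an assumed choice of scale). With normalize=False, raw cumulative
    updates are averaged, so clients with more steps pull harder.
    """
    n = len(centers)
    mean_steps = sum(local_steps) / n
    x, y = 0.0, 0.0
    for _ in range(rounds):
        agg_x = agg_y = 0.0
        for center, steps in zip(centers, local_steps):
            dx, dy = local_gda(x, y, center, steps, lr)
            scale = mean_steps / steps if normalize else 1.0
            agg_x += scale * dx / n
            agg_y += scale * dy / n
        x, y = x + agg_x, y + agg_y
    return x, y


if __name__ == "__main__":
    centers = [0.0, 4.0]   # heterogeneous client data (hypothetical values)
    local_steps = [1, 10]  # heterogeneous local computation
    print("naive:     ", federated_gda(centers, local_steps, normalize=False))
    print("normalized:", federated_gda(centers, local_steps, normalize=True))
```

In this toy setting, the naive run drifts toward the saddle point of the client that takes more local steps, while the normalized run stays close to the saddle point of the equally weighted objective. A small residual bias remains in the normalized run because multiple local steps on heterogeneous clients still cause some drift.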
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Dear Editors, the following changes have been made based on the reviewers' suggestions and comments.
Mathematical Details:
- Fixed some typos in the paper.
- Fixed the typo in Corollary 1.2.
- Added discussion on the boundedness of the constraint set $\mathcal{Y}$ for $y$ and the weak convexity of $\Phi$ in the NC-C case.
- Modified the definition of stochastic gradient complexity.
- In the theorem statements in the main paper, added pointers to where the specific values of the learning rates are given in the Appendix.
Experiments:
- Added the parameter values in Appendix D.
- Added training loss figures (Fig. 6, 7) for the robust neural network training experiments in Appendix D.
Assigned Action Editor: ~Simon_Lacoste-Julien1
Submission Number: 1128