On Convergence of FedProx: Local Dissimilarity Invariant Bounds, Non-smoothness and Beyond

Published: 31 Oct 2022, Last Modified: 28 Dec 2022
NeurIPS 2022 Accept
Readers: Everyone
Keywords: Federated learning, FedProx, Minibatch stochastic proximal point methods, Uniform stability, Non-convex optimization, Non-smooth optimization
Abstract: The FedProx algorithm is a simple yet powerful distributed proximal point optimization method widely used for federated learning (FL) over heterogeneous data. Despite its popularity and remarkable success in practice, the theoretical understanding of FedProx remains largely underinvestigated: its appealing convergence behavior has so far been characterized only under certain non-standard and unrealistic dissimilarity assumptions on the local functions, and the results are limited to smooth optimization problems. To remedy these deficiencies, we develop a novel local dissimilarity invariant convergence theory for FedProx and its minibatch stochastic extension through the lens of algorithmic stability. As a result, we derive several new and deeper insights into FedProx for non-convex federated optimization, including: 1) convergence guarantees invariant to certain stringent local dissimilarity conditions; 2) convergence guarantees for non-smooth FL problems; and 3) linear speedup with respect to the minibatch size and the number of sampled devices. Our theory reveals for the first time that local dissimilarity conditions and smoothness are not prerequisites for FedProx to attain favorable complexity bounds.
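For readers unfamiliar with the method, the following is a minimal sketch of a FedProx-style round, not the paper's analysis or experiments. It assumes scalar quadratic local losses f_i(w) = 0.5 * (w - c_i)^2 so that each device's proximal subproblem has a closed-form solution. The names `local_prox_step`, `fedprox`, `centers`, and `mu` are hypothetical and chosen only for illustration.

```python
import random


def local_prox_step(center, w_global, mu):
    """Exact minimizer of 0.5*(w - center)**2 + (mu/2)*(w - w_global)**2.

    Assumption: the local loss is a scalar quadratic, so the proximal
    subproblem is solved in closed form rather than by an inner solver.
    """
    return (center + mu * w_global) / (1.0 + mu)


def fedprox(centers, mu=1.0, rounds=50, devices_per_round=None, seed=0):
    """Run a toy FedProx loop over devices with quadratic losses."""
    rng = random.Random(seed)
    n = len(centers)
    k = n if devices_per_round is None else devices_per_round
    w = 0.0  # global model, initialized at zero
    for _ in range(rounds):
        # Sample k devices without replacement for this round.
        sampled = rng.sample(range(n), k)
        # Each sampled device solves its proximal subproblem around w.
        local_solutions = [local_prox_step(centers[i], w, mu) for i in sampled]
        # The server averages the local solutions.
        w = sum(local_solutions) / k
    return w


# Heterogeneous devices: each local minimizer c_i differs.
centers = [-2.0, 0.5, 1.0, 4.5]
w_full = fedprox(centers, mu=1.0, rounds=50)
```

With all devices participating, the iterate in this toy setting contracts toward the mean of `centers`, which minimizes the average loss. Passing `devices_per_round` smaller than the number of devices gives a sampled variant whose iterates fluctuate around that point.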
TL;DR: We derive several new and deeper theoretical insights into the FedProx algorithm under milder conditions.
Supplementary Material: pdf