Keywords: Federated learning, communication-efficient training, error feedback, sparsification, non-IID data, partial participation, local SGD, non-convex convergence
TL;DR: We propose SA-PEF, a step-ahead partial error-feedback method for federated learning that mitigates early gradient mismatch on non-IID data, maintains late-stage stability, and provably converges under biased compression.
Abstract: Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We propose step-ahead partial error feedback (SA-PEF), which integrates step-ahead (SA) correction with partial error feedback (PEF). SA-PEF recovers EF when the step-ahead coefficient $\alpha=0$ and step-ahead EF (SAEF) when $\alpha=1$. For non-convex objectives and $\delta$-contractive compressors, we establish a second-moment bound and a residual recursion that together guarantee convergence to stationarity under heterogeneous data and partial client participation. The resulting rates match standard non-convex Fed-SGD guarantees up to constant factors, achieving $O((\eta\,\eta_0 T R)^{-1})$ convergence to a variance/heterogeneity floor with a fixed inner step size. Our analysis reveals a step-ahead-controlled residual contraction factor $\rho_r$ that explains the observed acceleration in the early training phase. Guided by this trade-off analysis, we select a fixed step-ahead weight $\alpha$ near the theory-predicted optimum, balancing SAEF's rapid warm-up with EF's long-term stability. Experiments across diverse architectures and datasets show that SA-PEF consistently reaches target accuracies faster than EF.
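To make the interpolation between EF ($\alpha=0$) and SAEF ($\alpha=1$) concrete, below is a minimal, hedged sketch of one client communication round under a top-k contractive compressor. The exact SA-PEF update rule is specified in the submission; the function names, the top-k choice, and the particular way $\alpha$ splits the fresh compression error between an immediate local correction and the carried residual are illustrative assumptions, not the paper's pseudocode.

```python
# Hedged sketch, not the paper's exact algorithm: alpha interpolates between plain
# EF (alpha=0) and a step-ahead variant (alpha=1) by letting the client absorb a
# fraction of the fresh compression error into its local model immediately, while
# only the compressed message is communicated; the remainder stays in the residual.
import numpy as np

def topk(v, k):
    """Biased, delta-contractive top-k compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def sa_pef_client_round(local_model, grad, residual, alpha, k, lr):
    """One illustrative client round; names and update order are assumptions."""
    message = grad + residual               # error-feedback correction of the gradient
    compressed = topk(message, k)           # only this sparse vector is communicated
    fresh_error = message - compressed      # compression error incurred this round
    # Step-ahead portion: apply alpha of the fresh error locally right away, so it
    # does not linger in the residual and cause early-round gradient mismatch.
    local_model = local_model - lr * (compressed + alpha * fresh_error)
    residual = (1.0 - alpha) * fresh_error  # deferred error; plain EF when alpha=0
    return local_model, compressed, residual
```

Under this reading, $\alpha=0$ reduces to the standard EF recursion, while larger $\alpha$ drains the residual faster at the cost of a local/transmitted model discrepancy, which is the warm-up versus stability trade-off the abstract describes.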
Supplementary Material: zip
Primary Area: optimization
Submission Number: 19185