Anderson Accelerated Asynchronous Method for Distributed Optimization

03 May 2026 (modified: 06 May 2026) | Under review for TMLR | CC BY 4.0
Abstract: Anderson acceleration (AA) is an effective technique for accelerating fixed-point iterations, but it is rarely applied to distributed optimization or distributed machine learning. In this paper, we apply AA to accelerate an asynchronous distributed gradient method over the master-worker architecture, resulting in the Asynchronous Distributed Gradient Method with Anderson Acceleration (ADGM-AA). Specifically, we first recast the asynchronous gradient method as a fixed-point iteration and then combine it with AA. To ensure the global convergence of ADGM-AA, we equip it with a novel reference-path-based safeguard scheme. We prove that, under mild conditions, ADGM-AA converges with fixed step-sizes that are independent of the delays. Compared with the delay-dependent step-sizes used in most existing works, our delay-free step-size is easier to determine and often leads to faster convergence. Numerical experiments demonstrate that, by incorporating the AA scheme, the proposed ADGM-AA significantly outperforms the vanilla asynchronous distributed gradient method.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Franck_Iutzeler1
Submission Number: 8738
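The abstract's core idea, viewing a gradient method as a fixed-point iteration x = g(x) and then applying Anderson acceleration to it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's ADGM-AA: the code is serial and synchronous (no master-worker architecture, no delays), uses AA with a memory of one, runs on a toy diagonal quadratic, and uses a simple residual-comparison safeguard rather than the reference-path-based scheme the abstract mentions. All function names (`make_gradient_map`, `plain_iteration`, `anderson_m1`) are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def sub(u: Vector, v: Vector) -> Vector:
    return [a - b for a, b in zip(u, v)]


def norm(u: Vector) -> float:
    return dot(u, u) ** 0.5


def make_gradient_map(diag: Vector, b: Vector, step: float) -> Callable[[Vector], Vector]:
    """Gradient step g(x) = x - step * (A x - b) for A = diag(diag).

    Fixed points of g are exactly the minimizers of 0.5 x^T A x - b^T x.
    """
    def g(x: Vector) -> Vector:
        return [xi - step * (d * xi - bi) for xi, d, bi in zip(x, diag, b)]
    return g


def plain_iteration(g, x0: Vector, tol: float = 1e-10,
                    max_iter: int = 10000) -> Tuple[Vector, int]:
    """Unaccelerated fixed-point iteration x <- g(x)."""
    x = list(x0)
    for k in range(max_iter):
        gx = g(x)
        if norm(sub(gx, x)) <= tol:
            return x, k
        x = gx
    return x, max_iter


def anderson_m1(g, x0: Vector, tol: float = 1e-10,
                max_iter: int = 10000) -> Tuple[Vector, int]:
    """Anderson acceleration with memory 1 and a simple safeguard (illustrative)."""
    x = list(x0)
    x_prev = g_prev = None
    for k in range(max_iter):
        gx = g(x)
        r = sub(gx, x)                      # fixed-point residual at x
        if norm(r) <= tol:
            return x, k
        candidate = gx                      # default: plain step
        if x_prev is not None:
            dr = sub(r, sub(g_prev, x_prev))
            denom = dot(dr, dr)
            if denom > 0.0:
                # gamma minimizes ||r - gamma * dr|| (scalar least squares)
                gamma = dot(r, dr) / denom
                dg = sub(gx, g_prev)
                aa = [gi - gamma * di for gi, di in zip(gx, dg)]
                # Safeguard (assumed, not the paper's scheme): accept the AA
                # point only if its residual is no worse than the plain step's.
                if norm(sub(g(aa), aa)) <= norm(sub(g(gx), gx)):
                    candidate = aa
        x_prev, g_prev = x, gx
        x = candidate
    return x, max_iter


if __name__ == "__main__":
    g = make_gradient_map([1.0, 10.0], [1.0, 10.0], step=0.1)
    print("plain iterations:", plain_iteration(g, [0.0, 0.0])[1])
    print("AA(1) iterations:", anderson_m1(g, [0.0, 0.0])[1])
```

Because the safeguard falls back to the plain step whenever the accelerated point has a larger residual, this sketch inherits the convergence of the underlying gradient map on this toy problem. The extra evaluations of `g` inside the safeguard are a cost chosen here for simplicity, not a recommendation.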