Decentralized double stochastic averaging gradient

Aryan Mokhtari, Alejandro Ribeiro

2015 (modified: 04 Nov 2022) | ACSSC 2015 | Readers: Everyone

Abstract: This paper considers convex optimization problems in which the nodes of a network have access to summands of a global objective function. Each of these local objectives is further assumed to be an average of a finite set of functions. The motivation for this setup is to solve large-scale machine learning problems in which the elements of the training set are distributed across multiple computational elements. The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as a solution method that relies on: (i) the use of local stochastic averaging gradients instead of local full gradients; and (ii) the determination of descent steps as differences of consecutive stochastic averaging gradients. The algorithm is shown to approach the optimal argument at a linear rate. This is in contrast to all other available methods for distributed stochastic optimization, which converge at sublinear rates. Numerical experiments verify the linear convergence of DSA and illustrate its advantages relative to these alternatives.
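The two ingredients named in the abstract can be illustrated with a small sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration: the local estimate is taken to be a SAGA-style stochastic averaging gradient (a fresh sample gradient, minus its stored value, plus the average of the stored table), and the descent step is taken to be an EXTRA-style recursion driven by the difference of two consecutive estimates. The function name `dsa_sketch`, the mixing matrices `W` and `W_tilde = (I + W) / 2`, the step size `alpha`, and the scalar quadratic objectives `0.5 * (x - a)**2` are all hypothetical choices, not taken from the paper.

```python
import random


def dsa_sketch(data, W, alpha=0.05, iters=3000, seed=0):
    """Illustrative sketch only; not the authors' reference implementation.

    data[n][i] = a holds the i-th sample of node n, whose local function is
    assumed to be f_{n,i}(x) = 0.5 * (x - a)**2 with a scalar variable x.
    W is assumed to be a symmetric, doubly stochastic mixing matrix.
    """
    rng = random.Random(seed)
    N, q = len(data), len(data[0])
    # Assumed second mixing matrix, W_tilde = (I + W) / 2.
    W_tilde = [[(W[n][m] + (1.0 if n == m else 0.0)) / 2.0 for m in range(N)]
               for n in range(N)]

    def grad(x, a):
        # Gradient of the assumed quadratic 0.5 * (x - a)**2.
        return x - a

    x_prev = [0.0] * N
    # Gradient table: last evaluated gradient of every local function.
    table = [[grad(x_prev[n], a) for a in data[n]] for n in range(N)]

    def averaging_gradient(n, x):
        # (i) Stochastic averaging gradient: sample one local function,
        # correct its fresh gradient with the stored table, then refresh.
        i = rng.randrange(q)
        fresh = grad(x, data[n][i])
        g = fresh - table[n][i] + sum(table[n]) / q
        table[n][i] = fresh
        return g

    # Initial step: one round of mixing plus a gradient step.
    g_prev = [averaging_gradient(n, x_prev[n]) for n in range(N)]
    x = [sum(W[n][m] * x_prev[m] for m in range(N)) - alpha * g_prev[n]
         for n in range(N)]

    for _ in range(iters):
        g = [averaging_gradient(n, x[n]) for n in range(N)]
        # (ii) Descent step built from the difference of two consecutive
        # stochastic averaging gradients.
        x_next = [
            x[n]
            + sum(W[n][m] * x[m] for m in range(N))
            - sum(W_tilde[n][m] * x_prev[m] for m in range(N))
            - alpha * (g[n] - g_prev[n])
            for n in range(N)
        ]
        x_prev, x, g_prev = x, x_next, g
    return x
```

With these assumed quadratics and an equal number of samples per node, the global minimizer is the mean of all values in `data`, so each node's returned iterate can be compared against that mean to check consensus and optimality. Each node touches only one sample gradient per iteration and exchanges iterates only with the neighbors encoded in `W`, which is the cost structure the abstract motivates.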
