DADAM: A consensus-based distributed adaptive gradient method for online optimization

Parvin Nazari; Davoud Ataee Tarzanagh; George Michailidis

27 Sept 2018 (modified: 22 Jun 2025); ICLR 2019 Conference Blind Submission; Readers: Everyone

Abstract: Online and stochastic optimization methods such as SGD, ADAGRAD and ADAM are key algorithms for solving large-scale machine learning problems, including deep learning. A number of schemes based on communication between nodes and a central server have recently been proposed in the literature to parallelize them. A bottleneck of such centralized algorithms lies in the high communication cost incurred by the central node. In this paper, we present a new consensus-based distributed adaptive moment estimation method (DADAM) for online optimization over a decentralized network that enables data parallelization as well as decentralized computation. Such a framework not only can be extremely useful for learning agents with access to only local data in a communication-constrained environment but, as shown in this work, can also outperform centralized adaptive algorithms such as ADAM for certain realistic classes of loss functions. We analyze the convergence properties of the proposed algorithm and provide a *dynamic regret* bound on the convergence rate of adaptive moment estimation methods in both stochastic and deterministic settings. Empirical results demonstrate that DADAM works well in practice and compares favorably to competing online optimization methods.
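
To make the idea in the abstract concrete, here is a minimal sketch of consensus-based decentralized adaptive updates. It is not the paper's algorithm: the ring topology, the 1/3 mixing weights, the scalar quadratic local losses, the `lr / sqrt(t)` step size, and the function names `ring_mixing_matrix` and `decentralized_adam` are all assumptions chosen for illustration. Each node averages its iterate with its neighbors, then applies an Adam-style step using only its own local gradient, with no central server involved.

```python
import math


def ring_mixing_matrix(n):
    """Doubly stochastic weights for a ring graph (assumed topology):
    each node puts weight 1/3 on itself and on each of its two neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def decentralized_adam(targets, steps=2000, lr=0.1,
                       beta1=0.9, beta2=0.999, eps=1e-8):
    """Illustrative only: node i holds the local loss 0.5 * (x - targets[i])**2
    and keeps its own iterate x[i] and Adam moment estimates m[i], v[i]."""
    n = len(targets)
    W = ring_mixing_matrix(n)
    x = [0.0] * n
    m = [0.0] * n
    v = [0.0] * n
    for t in range(1, steps + 1):
        # Consensus step: each node averages iterates over its neighborhood.
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        step = lr / math.sqrt(t)  # decaying step size (an assumption)
        for i in range(n):
            g = x[i] - targets[i]  # local gradient at the node's own iterate
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias-corrected first moment
            v_hat = v[i] / (1 - beta2 ** t)  # bias-corrected second moment
            # Adaptive step applied to the neighborhood average.
            x[i] = mixed[i] - step * m_hat / (math.sqrt(v_hat) + eps)
    return x


if __name__ == "__main__":
    final = decentralized_adam([1.0, 2.0, 3.0, 6.0])
    print("node iterates:", final)
    print("disagreement:", max(final) - min(final))
```

The sketch only illustrates how neighbor averaging keeps the nodes' iterates close to one another while each node adapts its step using local moment estimates. It makes no claim about where the iterates settle or about the regret bound analyzed in the paper.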

Code: [Tarzanagh/DADAM](https://github.com/Tarzanagh/DADAM)

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/dadam-a-consensus-based-distributed-adaptive/code)
