DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression

ICML 2019 (modified: 11 Nov 2022)
Abstract: A standard approach in large-scale machine learning is distributed stochastic gradient training, which requires the computation of aggregated stochastic gradients over multiple nodes on a network. ...
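
The title names the core idea: error-compensated compression applied on both communication passes (workers to server and server back to workers). Below is a minimal sketch of that double-pass error-feedback pattern, assuming a top-k compressor and a simple parameter-server layout; the function and variable names are illustrative, not the paper's actual implementation.

```python
import numpy as np

def top_k_compress(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest.
    (Hypothetical compressor choice; any bounded-error compressor could be used.)"""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def doublesqueeze_step(params, worker_grads, worker_errors, server_error, lr, k):
    """One step of double-pass error-compensated compressed SGD (illustrative sketch).

    worker_grads:  list of local stochastic gradients, one per worker
    worker_errors: per-worker accumulated compression error (same shape as params)
    server_error:  server-side accumulated compression error
    """
    n = len(worker_grads)

    # Pass 1: each worker adds its stored error, compresses, and "sends" the result.
    compressed = []
    for i, g in enumerate(worker_grads):
        corrected = g + worker_errors[i]
        c = top_k_compress(corrected, k)
        worker_errors[i] = corrected - c   # error feedback: remember what was lost
        compressed.append(c)

    # Server aggregates the compressed worker messages.
    aggregate = sum(compressed) / n

    # Pass 2: server adds its own error, compresses, and "broadcasts" back.
    corrected = aggregate + server_error
    broadcast = top_k_compress(corrected, k)
    server_error = corrected - broadcast

    # Every worker applies the same doubly compressed update.
    params = params - lr * broadcast
    return params, worker_errors, server_error
```

The design point this sketch illustrates is that neither compression pass discards information permanently: the residual of each compression is folded back into the next step's message, which is what lets heavily compressed communication still converge in practice.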
