2018 (modified: 11 Nov 2022), ICML 2018, Readers: Everyone
Abstract: Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by ...