2018 (modified: 17 May 2023) | AISTATS 2018 | Readers: Everyone
Abstract: It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond...