Continuous Propagation: Layer-Parallel Training

19 Jan 2018 (modified: 25 Jan 2018), ICLR 2018 Conference Withdrawn Submission
Abstract: Continuous propagation is a parallel technique for training deep neural networks with a batch size of one at full utilization of a multiprocessor system. It enables spatially distributed computation on emerging deep-learning hardware accelerators that do not impose the programming limitations of contemporary GPUs. The algorithm achieves model parallelism along the depth of a deep network. The method is based on a continuous representation of the optimization process and enables sustained gradient generation during all phases of computation. We demonstrate that, in addition to its increased concurrency, continuous propagation improves the convergence rate of state-of-the-art methods while matching their accuracy.
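The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea of depth-wise, layer-parallel training with batch size one: each layer acts as a pipeline stage that keeps generating and applying gradients while new samples continue to stream forward, so activations may be slightly stale when their gradients return. The `Stage` and `train` names, layer sizes, learning rate, and toy regression task are all assumptions for illustration, not the authors' formulation.

```python
# Hypothetical sketch of fine-grained, layer-pipelined SGD with batch size one.
# Not the paper's algorithm; a simulation of the depth-wise pipelining idea.
import numpy as np

rng = np.random.default_rng(0)

class Stage:
    """One layer of the depth-wise pipeline (linear + ReLU)."""
    def __init__(self, d_in, d_out, lr=1e-2):
        self.W = rng.normal(scale=d_in ** -0.5, size=(d_out, d_in))
        self.lr = lr
        self.cache = {}  # sample_id -> (input, pre-activation) saved at forward time

    def forward(self, sid, x):
        z = self.W @ x
        self.cache[sid] = (x, z)
        return np.maximum(z, 0.0)

    def backward(self, sid, grad_out):
        x, z = self.cache.pop(sid)
        grad_z = grad_out * (z > 0)
        grad_in = self.W.T @ grad_z
        # Update immediately when the gradient arrives (no global synchronization),
        # so later forward passes already see the new weights.
        self.W -= self.lr * np.outer(grad_z, x)
        return grad_in

def train(stages, samples, targets):
    """Drive the pipeline: inject one new sample per tick while gradients
    for earlier samples are still travelling back through deeper stages."""
    n_stages, losses = len(stages), []
    fwd_queue = [(s, samples[s]) for s in range(len(samples))]  # waiting samples
    in_flight = []  # (sample_id, stage_index, gradient) moving backward
    while fwd_queue or in_flight:
        # Backward wave: each in-flight gradient moves one stage per tick.
        next_flight = []
        for sid, idx, grad in in_flight:
            grad = stages[idx].backward(sid, grad)
            if idx > 0:
                next_flight.append((sid, idx - 1, grad))
        in_flight = next_flight
        # Forward wave: push the next sample through the stages on this tick
        # (a real accelerator would stagger this across ticks, one stage each).
        if fwd_queue:
            sid, x = fwd_queue.pop(0)
            for stage in stages:
                x = stage.forward(sid, x)
            loss_grad = x - targets[sid]  # gradient of 0.5 * ||y - t||^2
            losses.append(0.5 * float(loss_grad @ loss_grad))
            in_flight.append((sid, n_stages - 1, loss_grad))
    return losses

# Toy regression task (hypothetical): targets are a fixed function of the inputs.
T = rng.normal(size=(4, 8))
samples = [rng.normal(size=8) for _ in range(200)]
targets = [np.maximum(T @ x, 0.0) for x in samples]
stages = [Stage(8, 8), Stage(8, 8), Stage(8, 4)]
losses = train(stages, samples, targets)
print(f"mean loss, first 10 samples: {np.mean(losses[:10]):.3f}")
print(f"mean loss, last 10 samples:  {np.mean(losses[-10:]):.3f}")
```

The design choice this sketch highlights is the one the abstract emphasizes: every layer can stay busy on every tick because gradients are generated and consumed continuously rather than waiting for a full forward-backward sweep, at the cost of weight staleness between a sample's forward pass and its backward update.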
Keywords: Deep Learning, Model parallelism, Learning theory