Keywords: Layer-wise Compression, Distributed Variational Inequality
TL;DR: This paper studies adaptive layer-wise compression and optimistic dual averaging for distributed variational inequalities.
Abstract: We develop a general layer-wise and adaptive quantization framework with error and code-length guarantees and applications to solving large-scale distributed variational inequality problems. We also propose Quantized and Generalized Optimistic Dual Averaging (QODA), which achieves the optimal rate of convergence for distributed monotone VIs under absolute noise. We empirically show that the adaptive layer-wise quantization achieves up to a $47$% speedup in end-to-end training time when training a Wasserstein GAN on $4$ GPUs.
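To make the layer-wise quantization idea concrete, below is a minimal illustrative sketch of per-layer unbiased stochastic quantization with an adaptively chosen number of levels per layer. The function names (`quantize_layer`, `quantize_layerwise`) and the norm-based level-allocation heuristic are assumptions for illustration only; they are not the paper's QODA algorithm or its error/code-length analysis.

```python
# Illustrative sketch only: per-layer uniform stochastic quantization with a
# simple adaptive split of a bit budget across layers. Not the paper's method.
import numpy as np


def quantize_layer(v: np.ndarray, num_levels: int, rng: np.random.Generator) -> np.ndarray:
    """Unbiased stochastic uniform quantization of a single layer's vector."""
    scale = np.linalg.norm(v, ord=np.inf)
    if scale == 0.0:
        return np.zeros_like(v)
    # Map |v| / scale into [0, num_levels] and round stochastically (unbiased).
    ratio = np.abs(v) / scale * num_levels
    lower = np.floor(ratio)
    q = lower + (rng.random(v.shape) < (ratio - lower))
    return np.sign(v) * scale * q / num_levels


def quantize_layerwise(layers: list, bit_budget: int = 4, seed: int = 0) -> list:
    """Quantize each layer separately, giving larger-norm layers more levels.

    The allocation rule below is a hypothetical heuristic; the paper's adaptive
    scheme with error and code-length guarantees is more involved.
    """
    rng = np.random.default_rng(seed)
    norms = np.array([np.linalg.norm(v) for v in layers])
    total = norms.sum()
    weights = norms / total if total > 0 else np.full(len(layers), 1.0 / len(layers))
    out = []
    for v, w in zip(layers, weights):
        # At least one level; proportionally more levels for heavier layers.
        levels = max(1, int(round(w * len(layers) * (2 ** bit_budget - 1))))
        out.append(quantize_layer(v, levels, rng))
    return out
```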
Submission Number: 108