SQMG: An Optimized Stochastic Quantization Method Using Multivariate Gaussians for Distributed Learning
Abstract: Distributed learning is pivotal for training large deep neural networks across multiple nodes, leveraging parallel computation to accelerate training. However, it faces challenges in communication efficiency and resource utilization. Asynchronous Quantized Stochastic Gradient Descent (AQSGD) addresses communication bottlenecks by exchanging quantized model updates, thereby expediting training and reducing bandwidth usage. Yet current stochastic quantization methods may inadequately capture varied gradient distributions, leading to accumulated biases and amplified quantization errors. These issues are compounded as the number of distributed nodes grows. This study proposes Stochastic Quantization with Multivariate Gaussians (SQMG), a novel quantization method for distributed machine learning. SQMG employs a multivariate Gaussian model to capture the relationships among gradient updates for quantization. This model is used to construct an optimized quantization target space, coupled with an iterative mapping scheme that projects parameters onto this space while minimizing quantization error. Experiments with DNN and CNN models on MNIST and CIFAR-10 show that SQMG improves accuracy by 0.92% and 1.54%, respectively, over conventional quantization methods. The results validate SQMG's ability to reduce quantization errors and improve model accuracy in distributed learning systems.
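To make the idea of distribution-aware stochastic quantization concrete, the sketch below shows a simplified, per-coordinate variant: quantization levels are placed at quantiles of a Gaussian fitted to the gradient entries, and each entry is stochastically rounded to one of the two adjacent levels so that the quantizer is unbiased in expectation. This is only an illustration of the general technique, not the SQMG algorithm itself; SQMG's multivariate Gaussian modeling, optimized target space, and iterative mapping scheme are not specified in the abstract, and all function names here are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def gaussian_quantization_levels(grad, num_levels=8):
    """Place quantization levels at quantiles of a univariate Gaussian fitted
    to the gradient entries (illustrative stand-in for an optimized target space)."""
    mu, sigma = grad.mean(), grad.std() + 1e-12
    probs = (np.arange(num_levels) + 0.5) / num_levels  # evenly spaced probabilities in (0, 1)
    return mu + sigma * norm.ppf(probs)                 # map through the Gaussian inverse CDF


def stochastic_quantize(grad, levels):
    """Unbiased stochastic rounding of each gradient entry to one of the two
    adjacent quantization levels."""
    flat = grad.ravel()
    clipped = np.clip(flat, levels[0], levels[-1])
    upper_idx = np.clip(np.searchsorted(levels, clipped), 1, len(levels) - 1)
    lo, hi = levels[upper_idx - 1], levels[upper_idx]
    # Round up with probability proportional to the distance from the lower level,
    # so the quantized value equals the original value in expectation.
    p_up = (clipped - lo) / (hi - lo)
    q = np.where(np.random.rand(flat.size) < p_up, hi, lo)
    return q.reshape(grad.shape)


# Usage on a synthetic gradient vector.
g = np.random.randn(1000) * 0.05 + 0.01
levels = gaussian_quantization_levels(g, num_levels=8)
q = stochastic_quantize(g, levels)
print("quantization MSE:", np.mean((q - g) ** 2))
```

In a distributed setting, each worker would transmit only the level indices (here, 3 bits per entry for 8 levels) plus the fitted Gaussian parameters, which is the source of the bandwidth savings the abstract refers to.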