Enhancing Low-Precision Sampling via Stochastic Gradient Hamiltonian Monte Carlo

Published: 01 Nov 2023, Last Modified: 22 Dec 2023
Venue: MLNCP Poster
Keywords: SGMCMC, Low-precision Sampling, SGHMC, Bayesian Deep Learning.
TL;DR: We provide the first comprehensive investigation of low-precision SGHMC and demonstrate its advantages over low-precision SGLD.
Abstract: Low-precision training has emerged as a promising low-cost technique to enhance the training efficiency of deep neural networks without sacrificing much accuracy. Its Bayesian counterpart can further provide uncertainty quantification and improved generalization accuracy. This paper investigates low-precision samplers via Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) with low-precision and full-precision gradient accumulators for both strongly log-concave and non-log-concave distributions. Theoretically, our results show that, to reach $\epsilon$-error in the 2-Wasserstein distance for non-log-concave distributions, low-precision SGHMC achieves a quadratic improvement ($\tilde{\mathcal{O}}\left({\epsilon^{-2}{\mu^*}^{-2}\log^2\left({\epsilon^{-1}}\right)}\right)$) over the state-of-the-art low-precision sampler, Stochastic Gradient Langevin Dynamics (SGLD) ($\tilde{\mathcal{O}}\left({{\epsilon}^{-4}{\lambda^{*}}^{-1}\log^5\left({\epsilon^{-1}}\right)}\right)$). Moreover, we prove that low-precision SGHMC is more robust to quantization error than low-precision SGLD, owing to the robustness of the momentum-based update with respect to gradient noise. Empirically, we conduct experiments on synthetic data and on the MNIST, CIFAR-10, and CIFAR-100 datasets, which validate our theoretical findings. Our study highlights the potential of low-precision SGHMC as an efficient and accurate sampling method for large-scale and resource-limited deep learning.
Submission Number: 34
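
The abstract contrasts SGHMC with full-precision and low-precision gradient accumulators. The sketch below illustrates that distinction on a one-dimensional toy target. It is not the paper's algorithm: it assumes a plain Euler-type discretization of underdamped Langevin dynamics and a fixed-point grid with stochastic rounding, and every name and default in it (`stochastic_round`, `low_precision_sghmc`, `step`, `friction`, `delta`) is hypothetical, chosen only for illustration.

```python
import math
import random


def stochastic_round(x, delta, rng):
    """Round x to a multiple of delta, up or down at random, so the result is unbiased."""
    lower = math.floor(x / delta) * delta
    p_up = (x - lower) / delta  # probability of rounding up, in [0, 1)
    return lower + delta if rng.random() < p_up else lower


def low_precision_sghmc(grad, x0, n_steps, step=0.05, friction=1.0,
                        delta=2.0 ** -6, full_precision_accumulator=True, seed=0):
    """Toy 1-D SGHMC sampler with quantized weights (illustrative assumptions only).

    grad: callable returning a (possibly noisy) gradient of the energy U at x.
    full_precision_accumulator: if True, x and v are stored in full precision and
        only the point where the gradient is evaluated is quantized; if False,
        x and v are themselves rounded to the grid after every update.
    """
    rng = random.Random(seed)
    x, v = x0, 0.0
    noise_scale = math.sqrt(2.0 * friction * step)
    samples = []
    for _ in range(n_steps):
        # The gradient is always evaluated at low-precision weights.
        x_q = stochastic_round(x, delta, rng)
        g = grad(x_q)
        # Momentum update: friction, gradient force, and injected Gaussian noise.
        v = v - step * friction * v - step * g + noise_scale * rng.gauss(0.0, 1.0)
        # Position update using the new momentum.
        x = x + step * v
        if not full_precision_accumulator:
            # Low-precision accumulators: store x and v on the grid as well.
            x = stochastic_round(x, delta, rng)
            v = stochastic_round(v, delta, rng)
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Standard Gaussian target: U(x) = x^2 / 2, so grad U(x) = x.
    for full in (True, False):
        xs = low_precision_sghmc(lambda x: x, 0.0, 20000,
                                 full_precision_accumulator=full)
        mean = sum(xs) / len(xs)
        var = sum((s - mean) ** 2 for s in xs) / len(xs)
        print(f"full_precision_accumulator={full}: mean={mean:.3f}, var={var:.3f}")
```

Stochastic rounding is used here because it keeps the quantized value unbiased, so the quantization error enters the momentum update as extra zero-mean noise. This is one simple way to picture the robustness argument in the abstract, under the assumptions stated above.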