Keywords: Bayesian Deep Learning, Low Precision Deep Learning, Multiplicative Weight Update
Abstract: We design a new algorithm for stable low-precision deep learning, motivated by the robustness of biological neural networks. It is well known that the stationary distribution of synaptic spine sizes is log-normal and arises from noisy multiplicative dynamics. Building on these synaptic fluctuations underlying neural computation, we propose the Log-normal Multiplicative Dynamics (LMD) algorithm for stable learning under low-precision computation. The method is derived via variational training with a log-normal posterior distribution over the weights. LMD is a multiplicative weight update method that overcomes the scalability challenges of earlier multiplicative updates. We show empirically that LMD learns stably under low-precision matrix multiplications in the forward pass, and that it trains Vision Transformer and GPT-2 scale architectures from scratch with accurate results. These results suggest that multiplicative dynamics, a biological feature, can maintain performance under low-precision computation, a promising direction for energy-efficient hardware.
Primary Area: optimization
Submission Number: 3072
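The abstract describes LMD as a multiplicative weight update derived from variational training with a log-normal posterior over the weights. The sketch below is a minimal illustration of that general idea, not the authors' algorithm: with a log-normal posterior, log w ~ N(mu, sigma^2), a reparameterized gradient step on mu acts multiplicatively on the weight's median exp(mu). The single-weight setup, toy loss, learning rate, and omission of the KL/prior terms are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Variational parameters of a log-normal posterior over one weight:
# log w ~ N(mu, sigma^2). All names and values here are illustrative.
mu, log_sigma = 0.0, -2.0
lr = 1e-2

def loss_grad(w):
    # Toy quadratic loss (w - 1.5)^2; returns dL/dw.
    return 2.0 * (w - 1.5)

for step in range(2000):
    eps = rng.standard_normal()
    sigma = np.exp(log_sigma)
    w = np.exp(mu + sigma * eps)  # reparameterized log-normal sample
    g = loss_grad(w)
    # Chain rule: dL/dmu = dL/dw * dw/dmu = g * w, so the gradient step
    # on mu is multiplicative on the median weight:
    #   exp(mu_new) = exp(mu) * exp(-lr * g * w).
    mu -= lr * g * w
    # dL/d(log_sigma) = g * w * eps * sigma (KL/prior terms omitted here).
    log_sigma -= lr * g * w * eps * sigma

print(np.exp(mu))  # ~1.5: the median weight converges multiplicatively
```

Because the update multiplies the weight by a factor rather than adding an increment, its relative step size is scale-free, which is one intuition for why such dynamics can stay stable when the forward pass is computed at low precision.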