Stochastic Rounding Implicitly Regularizes Tall-and-Thin Matrices

Published: 06 Dec 2024, Last Modified: 05 May 2026. Venue: SIAM Journal on Matrix Analysis and Applications. Readers: Everyone. License: CC BY 4.0
Abstract: Motivated by the popularity of stochastic rounding in the context of machine learning and the training of large-scale deep neural network models, we consider stochastic nearness rounding of real matrices $\mathbf{A}$ with many more rows than columns. We provide novel theoretical evidence, supported by extensive experimental evaluation, that with high probability, the smallest singular value of a stochastically rounded matrix is well bounded away from zero—regardless of how close $\mathbf{A}$ is to being rank-deficient and even if $\mathbf{A}$ is rank-deficient. In other words, stochastic rounding implicitly regularizes tall-and-thin matrices $\mathbf{A}$ so that the rounded version has full column rank. Our proofs leverage powerful results in random matrix theory, and the idea that stochastic rounding errors do not concentrate in low-dimensional column spaces.
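The claim in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' experimental setup: it assumes rounding onto a uniform grid of width `spacing` (the paper's precise rounding model may differ), where each entry is rounded up with probability equal to its fractional distance above the lower grid point. The helper name `stochastic_round`, the matrix dimensions, the grid width, and the random seed are all hypothetical choices made for illustration.

```python
import numpy as np


def stochastic_round(A, spacing=1.0, rng=None):
    """Round each entry of A to a neighboring multiple of `spacing`.

    Assumed model: an entry is rounded up with probability equal to its
    fractional position between the two neighboring grid points, so the
    rounding is unbiased (the expected rounded value equals the entry).
    """
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(A, dtype=float) / spacing
    lower = np.floor(scaled)
    frac = scaled - lower                       # in [0, 1)
    round_up = rng.random(scaled.shape) < frac  # True with probability frac
    return (lower + round_up) * spacing


# Build a tall-and-thin matrix that is exactly rank-deficient
# by duplicating one column (illustrative sizes, not from the paper).
rng = np.random.default_rng(0)
n, d = 2000, 10
A = rng.uniform(0.0, 1.0, size=(n, d))
A[:, -1] = A[:, 0]

A_sr = stochastic_round(A, spacing=0.25, rng=rng)

# Singular values are returned in descending order, so [-1] is the smallest.
sigma_min_A = np.linalg.svd(A, compute_uv=False)[-1]
sigma_min_sr = np.linalg.svd(A_sr, compute_uv=False)[-1]

print(f"smallest singular value of A:         {sigma_min_A:.3e}")
print(f"smallest singular value of rounded A: {sigma_min_sr:.3e}")
```

In this sketch the two identical columns are rounded independently, so their rounded versions differ in a sizable fraction of the rows. That is the mechanism the abstract alludes to: the rounding errors do not concentrate in a low-dimensional column space, so the rounded matrix regains full column rank.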