Charting Flat Minima Using the Conserved Quantities of Gradient Flow

26 Sept 2022, 12:09 (modified: 09 Nov 2022, 02:12) · NeurReps 2022 Poster
Keywords: symmetry, gradient flow, conserved quantity, Lie group, Lie algebra
TL;DR: Continuous symmetries in architectures lead to conserved quantities in gradient flow, which parametrize extended minima.
Abstract: Empirical studies have revealed that many minima in the loss landscape of deep learning are connected and lie in a low-loss valley. We present a general framework for finding continuous symmetries in the parameter space, which give rise to these low-loss valleys. We introduce a novel set of nonlinear, data-dependent symmetries for neural networks. We then show that conserved quantities associated with linear symmetries can be used to define coordinates along the minima. The distribution of conserved quantities reveals that, under common initialization methods, gradient flow explores only a small part of the global minimum. By relating conserved quantities to the convergence rate and the sharpness of the minimum, we provide insight into how initialization affects convergence and generalizability. We also find the nonlinear action to be viable for building ensembles that improve robustness under certain adversarial attacks.
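The link between a continuous symmetry and a conserved quantity of gradient flow can be illustrated on a toy model. The sketch below is not the paper's code; it assumes a two-parameter model whose loss depends only on the product w1*w2, so the rescaling (w1, w2) → (a·w1, w2/a) leaves the loss invariant. Along gradient flow, the quantity Q = w1² − w2² is then conserved, and small-step gradient descent preserves it approximately:

```python
# Toy model with a rescale symmetry: loss depends only on w1 * w2,
# so (w1, w2) -> (a * w1, w2 / a) is a symmetry for any a > 0.
# The associated conserved quantity of gradient flow is Q = w1^2 - w2^2.
# (Illustrative sketch only; the target value and function names are
# assumptions, not taken from the paper.)

def loss(w1, w2, target=1.0):
    return 0.5 * (w1 * w2 - target) ** 2

def grad(w1, w2, target=1.0):
    r = w1 * w2 - target
    return r * w2, r * w1  # (dL/dw1, dL/dw2)

w1, w2 = 0.3, 1.7           # the initialization fixes Q = w1^2 - w2^2
q0 = w1 ** 2 - w2 ** 2
lr = 1e-3                   # small step size approximates gradient flow
for _ in range(20000):
    g1, g2 = grad(w1, w2)
    w1 -= lr * g1
    w2 -= lr * g2

# The trajectory reaches the valley of minima {w1 * w2 = 1}, but only the
# point on that valley selected by the conserved Q of its initialization.
print(f"loss = {loss(w1, w2):.2e}, Q drift = {abs(w1**2 - w2**2 - q0):.2e}")
```

Different initializations fix different values of Q and therefore land at different points of the same flat valley, which is one way to see the abstract's claim that common initializations explore only a small part of the global minimum.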