Characterizing Optimizer-Dependent Training Dynamics Through Hessian Eigenvector Displacement and Localization
Keywords: Hessian Eigenvectors, Training Dynamics, Loss Landscape Geometry, Optimization Dynamics, Eigenvector Localization, SGD, Adam, Sharpness-Aware Minimization
Abstract: Hessian spectral analysis is a standard tool for studying neural-network training, with eigenvalues linked to sharpness, generalization, and optimization dynamics. Eigenvalues quantify curvature magnitude, while eigenvectors identify *which parameters* generate that curvature. In this work, we study how the leading Hessian *eigenvectors* evolve during training and how they affect the learning trajectories.
We track the training dynamics of multilayer perceptrons on a classification problem and measure eigenvector dynamics through two complementary statistics: (i) displacement over time, inspired by analyses of glassy systems (Baity-Jesi et al., 2018), and (ii) localization via the inverse participation ratio. The metrics are compared against a random null model of the Hessian induced by the architecture.
Our preliminary results reveal clear optimizer-dependent behaviour. Under SGD, the leading curvature directions become progressively more stable, while under Adam the eigenvectors reorganize substantially more strongly throughout training. We also observe a localization phenomenon under Adam, where a small subset of parameters contributes disproportionately to the leading curvature directions. These results suggest that Hessian eigenvector dynamics capture key differences in optimizer behaviour and the resulting training trajectories.
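To make the two statistics named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The inverse participation ratio uses its standard definition, IPR(v) = sum_i v_i^4 / (sum_i v_i^2)^2, which equals 1/N for a fully delocalized vector and 1 for a vector supported on a single parameter. The abstract does not give the exact form of the displacement statistic, so `eigenvector_displacement` below assumes one plausible choice, 1 - |cos(angle)| between a reference eigenvector and a later one. The function names and the dense `eigh` call are illustrative assumptions that only suit small toy Hessians.

```python
import numpy as np


def inverse_participation_ratio(v):
    """Standard IPR: sum_i v_i^4 / (sum_i v_i^2)^2.

    Equals 1/N for a uniform vector of length N (delocalized)
    and 1 for a one-hot vector (fully localized).
    """
    v = np.asarray(v, dtype=float)
    norm_sq = np.sum(v ** 2)
    return float(np.sum(v ** 4) / norm_sq ** 2)


def eigenvector_displacement(v_ref, v_now):
    """ASSUMED displacement measure: 1 - |cosine similarity|.

    The absolute value makes the measure invariant to the arbitrary
    sign of an eigenvector. 0 means the direction is unchanged,
    1 means it has rotated to an orthogonal direction.
    """
    v_ref = np.asarray(v_ref, dtype=float)
    v_now = np.asarray(v_now, dtype=float)
    overlap = abs(np.dot(v_ref, v_now))
    overlap /= np.linalg.norm(v_ref) * np.linalg.norm(v_now)
    return float(1.0 - overlap)


def leading_eigenvector(hessian):
    """Eigenvector of the largest (algebraic) eigenvalue of a symmetric matrix.

    Dense decomposition; illustrative only, feasible for small toy Hessians.
    """
    eigvals, eigvecs = np.linalg.eigh(hessian)  # eigenvalues in ascending order
    return eigvecs[:, -1]


# Toy usage with two hypothetical 3x3 "Hessians" at different training times.
H_early = np.diag([1.0, 3.0, 2.0])
H_late = np.diag([1.0, 2.0, 3.0])
v_early = leading_eigenvector(H_early)
v_late = leading_eigenvector(H_late)
ipr_early = inverse_participation_ratio(v_early)
disp = eigenvector_displacement(v_early, v_late)
```

In this toy case the leading direction moves from one coordinate axis to another, so the displacement is maximal while each eigenvector is fully localized. For realistic networks the Hessian is too large to form explicitly, and the leading eigenvectors would typically be approximated with iterative methods based on Hessian-vector products.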
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 61