A Spectral Bound on Effective Sharpness for Fisher-Preconditioned Gradient Descent

TMLR Paper8132 Authors

27 Mar 2026 (modified: 17 Apr 2026) | Under review for TMLR | CC BY 4.0
Abstract: An explicit stability characterization of effective sharpness $\lambda_{\max}(F^{-1}H)$ under Fisher preconditioning is provided, decomposing stability into residual curvature and model misspecification components. When the Gauss-Newton matrix $G$ equals the Fisher $F$ (the correctly specified negative log-likelihood setting), it is shown that the effective sharpness satisfies $S_{\text{eff}} \leq 1 + \epsilon/\mu_{\min}(F)$, where $\epsilon = \|H - G\|_2$ is the spectral norm of the residual curvature and $\mu_{\min}(F)$ is the minimum eigenvalue of the Fisher Information Matrix. When $G \neq F$, a relaxed bound $S_{\text{eff}} \leq 1 + (\epsilon + \delta)/\mu_{\min}(F)$ is established, with $\delta = \|G - F\|_2$ measuring model misspecification, thereby separating the two sources of curvature error. An alignment-aware Rayleigh quotient analysis reveals that the worst-case bound is loose by 1.3--7.1$\times$ due to favorable alignment between the residual curvature $Q$ and the Fisher eigenvectors. Experiments on deep linear networks (55--3,240 parameters, 5 seeds per configuration) verify the general misspecification-aware bound at all tested scales. On a 110-parameter deep linear network where all quantities are computed exactly, the idealized bound is confirmed to hold when $G \approx F$ but is violated when model misspecification is substantial, while the general bound correctly holds at all measured iterations. The experimental range is too limited to draw conclusions about scaling behavior. K-FAC at CIFAR-10 ResNet-18 scale (11.2M parameters) achieves 90.5\% test accuracy (vs. SGD 86.2\%), operating in regions of 41$\times$ higher raw Hessian sharpness while converging stably, consistent with the spectral flattening mechanism, though direct measurement of $S_{\text{eff}}$ at this scale remains intractable.
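The general bound stated in the abstract can be checked numerically on small synthetic matrices. The sketch below is illustrative only and is not the authors' experimental code: the matrices are random stand-ins for $F$, $G$, and $H$, and the helper names (`effective_sharpness`, `general_bound`, `random_symmetric`), the dimension, the ridge term, and the perturbation scales are all assumptions made for this example. It assumes $F$ is symmetric positive definite and $G$, $H$ are symmetric, and computes $\lambda_{\max}(F^{-1}H)$ through the similar symmetric matrix $F^{-1/2} H F^{-1/2}$.

```python
import numpy as np

def effective_sharpness(F, H):
    """Largest eigenvalue of F^{-1} H for symmetric positive definite F.

    F^{-1} H is similar to the symmetric matrix F^{-1/2} H F^{-1/2},
    so both share the same (real) eigenvalues.
    """
    mu, U = np.linalg.eigh(F)
    F_inv_sqrt = (U / np.sqrt(mu)) @ U.T       # U diag(mu^{-1/2}) U^T
    M = F_inv_sqrt @ H @ F_inv_sqrt
    return np.linalg.eigvalsh((M + M.T) / 2)[-1]  # symmetrize against round-off

def general_bound(F, G, H):
    """1 + (eps + delta) / mu_min(F), with eps = ||H - G||_2, delta = ||G - F||_2."""
    eps = np.linalg.norm(H - G, 2)              # spectral norm of residual curvature
    delta = np.linalg.norm(G - F, 2)            # spectral norm of misspecification
    mu_min = np.linalg.eigvalsh(F)[0]           # smallest Fisher eigenvalue
    return 1.0 + (eps + delta) / mu_min

def random_symmetric(rng, n, scale):
    """Hypothetical helper: a random symmetric perturbation of a given scale."""
    A = rng.standard_normal((n, n))
    return scale * (A + A.T) / 2

# Synthetic stand-ins (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
n = 20
J = rng.standard_normal((n, 4 * n))
F = J @ J.T / (4 * n) + 1e-2 * np.eye(n)        # positive definite "Fisher"
G = F + random_symmetric(rng, n, 0.05)          # misspecified Gauss-Newton
H = G + random_symmetric(rng, n, 0.05)          # Hessian with residual curvature

s_eff = effective_sharpness(F, H)
bound = general_bound(F, G, H)
print(f"S_eff = {s_eff:.4f}, general bound = {bound:.4f}")
```

Setting `G = F` makes $\delta = 0$, so `general_bound` reduces to the idealized form $1 + \epsilon/\mu_{\min}(F)$. On random matrices like these the bound is typically far from tight, since nothing forces the perturbations to align with the smallest-eigenvalue direction of $F$.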
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Matthew_J._Holland1
Submission Number: 8132