Keywords: Optimization, Mode Connectivity, Generalization, Entropy, Curvature, Flatness, Sharpness
TL;DR: We show that even when minima are connected by a low-loss path, that path often exhibits a dynamical barrier produced by entropic forces.
Abstract: Modern neural networks exhibit a striking property: solutions at the bottom of the loss landscape are often connected by low-loss paths, yet optimization dynamics remain confined to one solution and rarely explore intermediate points. We resolve this paradox by identifying entropic barriers arising from the interplay between curvature variations along these paths and noise in optimization dynamics. Empirically, we find that curvature systematically rises away from minima, producing effective forces that bias noisy dynamics back toward the endpoints — even when the loss remains nearly flat. These barriers persist longer than energetic barriers, shaping the late-time localization of solutions in parameter space. Moreover, entropic confinement biases optimization away from poorly generalizing minima, helping to explain why such basins remain inaccessible despite their low training loss. Our results highlight the role of curvature-induced entropic forces in governing both connectivity and confinement in deep learning landscapes.
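A minimal sketch of the mechanism the abstract describes (an illustrative toy model of our own, not the paper's code): for overdamped Langevin dynamics at temperature T in a 2D valley V(x, y) = lam(x) * y^2 / 2, the loss along the path y = 0 is exactly flat, but marginalizing the transverse coordinate gives an effective free energy F(x) = (T/2) log lam(x) + const. If the transverse curvature lam(x) rises away from the endpoints, noise alone produces a restoring force toward them, an entropic barrier with no energetic counterpart. The curvature profile, temperature, and step size below are assumptions chosen for illustration.

```python
# Toy illustration (assumed model, not the paper's experiments): overdamped
# Langevin dynamics on V(x, y) = 0.5 * lam(x) * y**2. The loss along the
# path y = 0 is exactly flat, yet the sampler concentrates near the
# endpoints x = 0 and x = 1, where the transverse curvature lam(x) is
# lowest, because the marginal density is p(x) ~ lam(x)**(-1/2).
import numpy as np

rng = np.random.default_rng(0)

def lam(x):
    # Assumed transverse-curvature profile: 1 at the endpoints, 6 mid-path.
    return 1.0 + 20.0 * x * (1.0 - x)

def dlam(x):
    return 20.0 * (1.0 - 2.0 * x)

T, dt, steps = 0.05, 1e-3, 200_000
x, y = 0.5, 0.0                       # start at the midpoint of the path
xs = np.empty(steps)
sig = np.sqrt(2.0 * T * dt)           # Euler-Maruyama noise scale
for t in range(steps):
    gx = 0.5 * dlam(x) * y * y        # dV/dx (zero on the flat path y = 0)
    gy = lam(x) * y                   # dV/dy
    x += -gx * dt + sig * rng.standard_normal()
    y += -gy * dt + sig * rng.standard_normal()
    x = float(np.clip(x, 0.0, 1.0))   # reflecting endpoints of the path
    xs[t] = x

# Fraction of time spent in the outer quarters of the path: 0.5 for
# unbiased diffusion, noticeably larger here due to the entropic force.
print("outer fraction:", np.mean((xs < 0.25) | (xs > 0.75)))
```

Because the loss along the path is identically zero in this toy model, any confinement the simulation exhibits is purely entropic, which is the kind of barrier the abstract refers to.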
Primary Area: optimization
Submission Number: 20280