Understanding Deep Learning Requires Rethinking Sharpness

Published: 09 Jun 2025, Last Modified: 09 Jun 2025
Venue: HiLD at ICML 2025 Poster
License: CC BY 4.0
Keywords: Loss Landscape Geometry, Robustness, Calibration, Functional Similarity, Safety
TL;DR: We explore the hypothesis that robust, calibrated and functionally similar models sit at flatter minima; however, we find that models with these properties generally exist at sharper minima compared to baselines.
Abstract: The geometric flatness of neural network minima has long been associated with desirable generalisation properties. In this paper, we extensively explore the hypothesis that robust, calibrated and functionally similar models sit at flatter minima, in line with prevailing understandings of the relationship between flatness and generalisation. Contrary to common assertions in the literature, when using Sharpness-Aware Minimisation, augmentation and weight decay as regularisation controls, we find that increased sharpness accompanies improved generalisation, calibration and robustness across architectures. Our findings suggest that the role of increased sharpness should be considered independently for individual models when reasoning about the geometric properties of neural networks. We show that, relative to the flatter minima found without our regularisation controls, sharper minima can be related to generalisation and safety-relevant properties. Understanding these properties calls for a rethinking of the role of sharpness in loss landscape geometry.
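For readers unfamiliar with the Sharpness-Aware Minimisation control referenced in the abstract, the following is a minimal sketch of the standard two-step SAM update (ascend to an approximate worst-case perturbation of the weights, then descend using the gradient taken there). This is an illustrative sketch, not the authors' implementation; the function name, `rho` value and optimiser choice are assumptions.

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One illustrative SAM update: perturb weights towards higher loss, then step."""
    # First pass: gradients of the loss at the current weights.
    loss = loss_fn(model(x), y)
    loss.backward()

    # Perturb each parameter by rho * g / ||g|| (the approximate worst-case direction).
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)

    # Second pass: gradient of the perturbed (sharpness-aware) loss.
    model.zero_grad()
    loss_fn(model(x), y).backward()

    # Undo the perturbation and apply the update with the base optimiser.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    model.zero_grad()
    return loss.item()
```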
Student Paper: Yes
Submission Number: 112