Enhancing generalizability of deep networks via Fisher regularization

TMLR Paper4820 Authors

10 May 2025 (modified: 01 Aug 2025) | Rejected by TMLR | CC BY 4.0

Abstract: The generalization ability of a deep learning classifier depends strongly on the geometry of its loss landscape. Solutions in flatter regions are more robust and generalize better than those near sharp minima. In this paper, we study how the loss landscape affects the generalization of deep learning models and use its geometric information to propose a novel regularization method, Fisher regularization. By dynamically penalizing weights according to the curvature of the loss landscape, this adaptive regularization scheme guides optimization towards flatter, more generalizable solutions. We establish a theoretical foundation for the approach using PAC-Bayesian theory and empirically show that deep learning models trained with the proposed method outperform those trained with other strong regularization techniques across a range of challenging image classification benchmarks.
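The abstract does not state the exact form of the penalty. As a rough illustration only, one plausible reading of "penalizing weights based on their curvature" is a weight penalty scaled by the diagonal of the empirical Fisher information, recomputed during training. The sketch below applies that idea to a small logistic regression in NumPy. The penalty form `0.5 * sum(F_i * w_i^2)`, the function names, the synthetic data, and the hyperparameters `lam`, `lr`, and `steps` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_example_grads(w, X, y):
    """Gradient of the logistic loss for each example, shape (n, d)."""
    p = sigmoid(X @ w)
    return (p - y)[:, None] * X

def diagonal_fisher(w, X, y):
    """Empirical diagonal Fisher: mean of squared per-example gradients."""
    g = per_example_grads(w, X, y)
    return np.mean(g ** 2, axis=0)

def fisher_penalty(w, fisher_diag):
    """Assumed penalty form (hypothetical): 0.5 * sum_i F_i * w_i^2."""
    return 0.5 * np.sum(fisher_diag * w ** 2)

def train(X, y, lam=0.1, lr=0.5, steps=200):
    """Gradient descent on mean logistic loss plus lam * fisher_penalty."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # The Fisher diagonal is refreshed every step and then held fixed
        # when differentiating the penalty (an assumption of this sketch).
        F = diagonal_fisher(w, X, y)
        grad_loss = per_example_grads(w, X, y).mean(axis=0)
        grad_penalty = lam * F * w
        w = w - lr * (grad_loss + grad_penalty)
    return w

# Synthetic, linearly separable toy data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w = train(X, y)
```

With a Fisher diagonal of all ones, this penalty reduces to ordinary L2 weight decay, which is one way to see the scheme as weight decay whose per-weight strength adapts to a curvature estimate.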

Submission Length: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=46Jc6WnEFC

Changes Since Last Submission: In response to the action editor's comments, I have revised the manuscript as follows: 1) Paragraph spacing and margins: The spacing between paragraphs has been increased so that paragraph boundaries are clear. 2) Figure 1 improvement: The resolution and font size of Figure 1 have been increased to make it easier to read. 3) Equation punctuation: All unnecessary punctuation has been removed from within equations, and no trailing punctuation (e.g., commas) follows any displayed equation.

Assigned Action Editor: ~Zhihui_Zhu1

Submission Number: 4820
