Keywords: Generalization, Regularization, Training Method, Deep Learning, Inconsistency
Abstract: Accurately estimating the generalization gap and devising optimization methods that generalize better are crucial for deep learning, both for theoretical understanding and for practical applications. The ability to leverage unlabeled data for these purposes offers significant advantages in real-world scenarios. This paper introduces a novel generalization measure, termed $\textit{local inconsistency}$, developed from an information-geometric perspective on the neural network's parameter space; a key feature is that it is computable from unlabeled data. We establish its theoretical underpinnings by connecting local inconsistency to the Fisher Information Matrix (FIM) and the loss Hessian. Empirically, we demonstrate that local inconsistency not only correlates with the generalization gap but also exhibits characteristics comparable to $\textit{sharpness}$. Based on these findings, we propose Inconsistency-Aware Minimization (IAM), a regularization strategy that incorporates local inconsistency. We show that in standard supervised learning settings, IAM improves generalization, achieving performance comparable to existing methods such as Sharpness-Aware Minimization (SAM). Furthermore, IAM is particularly effective in semi-supervised learning scenarios, where the local inconsistency regularizer is computed on the unlabeled portion of the data to further improve model performance.
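The abstract's claim that a label-free measure can connect to the FIM is consistent with a standard information-geometric identity, sketched below under the assumption (not stated in the abstract) that local inconsistency is built from a divergence between predictive distributions at nearby weights. For a small perturbation $\delta$ of the parameters $\theta$, the second-order expansion of the KL divergence gives

$$\mathrm{KL}\big(p_\theta(y \mid x)\,\|\,p_{\theta+\delta}(y \mid x)\big) \approx \tfrac{1}{2}\,\delta^{\top} F(\theta)\,\delta, \qquad F(\theta) = \mathbb{E}_{x}\,\mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\big[\nabla_\theta \log p_\theta(y \mid x)\,\nabla_\theta \log p_\theta(y \mid x)^{\top}\big],$$

where the inner expectation is taken over the model's own predictive distribution, so the quantity requires no labels. The paper's exact definition and theory may differ; this is only the textbook expansion that would underlie such a connection.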
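The abstract does not specify how the IAM regularizer is implemented, so the following is a minimal PyTorch sketch of one plausible reading: local inconsistency measured as the KL divergence between the model's predictions and those of the same model under a small random weight perturbation, evaluated on unlabeled inputs. All names (`local_inconsistency`, `iam_step`, `rho`, `lam`) and the isotropic-Gaussian perturbation scheme are hypothetical illustrations, not the authors' method.

```python
import torch
import torch.nn.functional as F

def local_inconsistency(model, x_unlabeled, rho=0.05):
    """Hypothetical proxy for local inconsistency: KL divergence between the
    model's predictions and those of the same model under a small random
    weight perturbation, computed on unlabeled inputs only."""
    # Clean predictions; gradients flow through this forward pass.
    log_p = F.log_softmax(model(x_unlabeled), dim=-1)

    # Perturb the weights in place, recording the noise so it can be undone.
    noises = []
    with torch.no_grad():
        for w in model.parameters():
            n = rho * torch.randn_like(w)  # assumed perturbation: Gaussian, scale rho
            w.add_(n)
            noises.append(n)
        # Predictions of the perturbed model, treated as a fixed target.
        q = F.softmax(model(x_unlabeled), dim=-1)
        # Restore the original weights.
        for w, n in zip(model.parameters(), noises):
            w.sub_(n)

    # KL(q || p); no labels are needed, matching the abstract's key feature.
    return F.kl_div(log_p, q, reduction="batchmean")

def iam_step(model, optimizer, x_lab, y_lab, x_unlab, lam=0.1):
    """One hypothetical IAM-style update: supervised loss on labeled data
    plus the inconsistency regularizer on unlabeled data."""
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_lab), y_lab) \
           + lam * local_inconsistency(model, x_unlab)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the regularizer never touches labels, the same `local_inconsistency` term can be driven entirely by the unlabeled batch, which is what would enable the semi-supervised use the abstract describes.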
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 28941