Towards the Three-Phase Dynamics of Generalization Power of a DNN

ICLR 2026 Conference Submission 15907 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Generalization Analysis, Learning Dynamics, Deep Learning Theory
TL;DR: This paper explores a distinct three-phase dynamics of the generalization power of a DNN.
Abstract: This paper addresses a core challenge in symbolic generalization: how to define, quantify, and track the dynamics of generalizable and non-generalizable interactions encoded by a DNN throughout training. Specifically, this work builds upon a recent theoretical result in explainable AI, which proves that the detailed inference patterns of DNNs can be strictly rewritten as a small number of AND-OR interaction patterns. Based on this, we propose an efficient method to quantify the generalization power of each interaction, and we discover a distinct three-phase dynamics of the generalization power of interactions during training. In particular, the early phase of training typically removes noisy and non-generalizable interactions and learns simple, generalizable interactions. The second and third phases tend to capture increasingly complex interactions that are harder to generalize. Experimental results verify that the learning of non-generalizable interactions is the direct cause of the gap between training and testing losses.
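To make the notion of an AND interaction concrete, below is a minimal sketch of the standard Harsanyi-style AND interaction I(S) = Σ_{T⊆S} (−1)^{|S|−|T|} v(T) used in the interaction literature that the abstract refers to. The scalar-output model `f`, the baseline-masking scheme, and the choice of input variables are illustrative assumptions; this is not necessarily the authors' exact procedure for quantifying generalization power.

```python
# Sketch: AND interactions (Harsanyi dividends) over input-variable subsets.
# Assumes a scalar-output model f and masking unselected variables to a baseline.
import itertools
import torch


def masked_output(f, x, baseline, subset, variables):
    """v(T): evaluate f with all variables outside `subset` masked to the baseline."""
    x_masked = baseline.clone()
    for i in subset:
        x_masked[..., variables[i]] = x[..., variables[i]]
    return f(x_masked)


def and_interactions(f, x, baseline, variables):
    """Return I(S) = sum_{T subset of S} (-1)^{|S|-|T|} v(T) for every subset S."""
    n = len(variables)
    # Cache v(T) for all 2^n masked inputs (feasible only for small n).
    v = {}
    for r in range(n + 1):
        for T in itertools.combinations(range(n), r):
            v[T] = masked_output(f, x, baseline, T, variables).item()
    # Moebius inversion over the subset lattice gives the AND interactions.
    I = {}
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            I[S] = sum(
                (-1) ** (len(S) - len(T)) * v[T]
                for k in range(len(S) + 1)
                for T in itertools.combinations(S, k)
            )
    return I
```

Tracking how the strength of each I(S) for generalizable versus non-generalizable subsets evolves over training epochs is one way to visualize the three-phase dynamics described above.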
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 15907