Keywords: Deep Learning Theory, Representation Convergence, Overfitting
TL;DR: This paper validates two interrelated propositions that demonstrate a strong connection between the representation convergence of interactions and a DNN’s overfitting.
Abstract: This paper explores whether the generalization power of DNNs can be understood through the generalizability of the interactions between input variables that they encode, which is one of the central challenges in the field of symbolic generalization. Specifically, we propose and validate the following two propositions. First, we validate that the representation convergence of interactions and the overfitting degree of a DNN are strongly negatively correlated. Second, we demonstrate that proactively enhancing interaction convergence can effectively mitigate overfitting. Our results show that, apart from the interaction-level consistency we propose, other forms of representational consistency do not effectively mitigate a DNN’s overfitting. Furthermore, eliminating non-convergent interactions also successfully increases the proportion of interactions that generalize to testing samples.
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 3302