Towards the Decisive Factor of Symbolic Generalization of DNNs

Submitted to ICLR 2026. 16 Sept 2025 (modified: 11 Feb 2026). License: CC BY 4.0
Keywords: Model Generalization, Overfitting, Deep Learning Theory
TL;DR: This study shows that the randomness of parameter initialization in a DNN's low layers determines the composition of its confusing samples.
Abstract: Identifying the decisive factor that drives deep neural networks (DNNs) to learn non-generalizable representations (*i.e.,* non-generalizable interactions between input variables) has been a persistent challenge in the study of symbolic generalization. In this paper, we quantify the generalization power of the interactions encoded by DNNs, and we discover that DNNs usually learn non-generalizable interactions from a small set of samples, referred to as *confusing samples*. The emergence of confusing samples during training explains the overfitting of a DNN. We further discover that the composition of confusing samples is determined by the randomness of parameter initialization in the low layers of a DNN, whereas other factors, such as high-layer parameters and network architecture, have much less impact. Consequently, two DNNs initialized with different low-layer parameters eventually learn entirely different sets of confusing samples, even when they achieve similar performance.
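The abstract's central claim can be probed with a toy experiment. The sketch below is purely illustrative and is not the paper's interaction-based metric: it trains pairs of small NumPy MLPs that share or differ in their low-layer initialization seed, uses the highest-loss training points as a crude proxy for "confusing samples" (an assumption, not the paper's definition), and compares the overlap of those sets. All function names, the synthetic task, and hyperparameters are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: binary label defined by a simple pairwise interaction.
n, d = 200, 10
X = rng.normal(size=(n, d))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

def train_mlp(seed_low, seed_high, hidden=32, lr=0.1, steps=500):
    """Two-layer MLP: low-layer weights from seed_low, high-layer from seed_high."""
    r_low = np.random.default_rng(seed_low)
    r_high = np.random.default_rng(seed_high)
    W1 = r_low.normal(scale=0.5, size=(d, hidden))   # low-layer parameters
    b1 = np.zeros(hidden)
    W2 = r_high.normal(scale=0.5, size=(hidden, 1))  # high-layer parameters
    b2 = np.zeros(1)
    for _ in range(steps):
        h = np.maximum(X @ W1 + b1, 0.0)             # ReLU hidden layer
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid output
        g = (p - y[:, None]) / n                     # dL/dlogit for mean BCE
        gW2 = h.T @ g
        gh = (g @ W2.T) * (h > 0)
        gW1 = X.T @ gh
        W2 -= lr * gW2; b2 -= lr * g.sum(0)
        W1 -= lr * gW1; b1 -= lr * gh.sum(0)
    h = np.maximum(X @ W1 + b1, 0.0)
    return (1.0 / (1.0 + np.exp(-(h @ W2 + b2)))).ravel()

def hardest_samples(p, k=20):
    """Crude proxy for 'confusing samples': the k highest-loss training points."""
    loss = -(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    return set(np.argsort(loss)[-k:])

def jaccard(a, b):
    return len(a & b) / len(a | b)

# Claimed trend: shared low-layer init -> similar confusing sets;
# different low-layer init -> divergent sets, even at similar accuracy.
s_same = jaccard(hardest_samples(train_mlp(1, 10)),
                 hardest_samples(train_mlp(1, 11)))
s_diff = jaccard(hardest_samples(train_mlp(1, 10)),
                 hardest_samples(train_mlp(2, 10)))
print(f"overlap (same low-layer init):      {s_same:.2f}")
print(f"overlap (different low-layer init): {s_diff:.2f}")
```

On a toy task of this size the gap between the two overlap scores may be noisy; the paper's result concerns the interaction-based measure on real DNNs, which this proxy only gestures at.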
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 6746