Sublinearly Structured Deep Neural Networks Achieve Feature Learning Consistency for Compositional Functions
Abstract: Over the past decade, deep neural networks (DNNs) have achieved remarkable success on complex machine-learning tasks, yet the theoretical foundations of their performance remain incomplete. From a statistical viewpoint, a natural question is: can DNNs attain feature-learning and prediction consistency comparable to that of classical models? While a full characterization is open, we provide positive results for a broad subclass. We establish consistency guarantees for sublinearly structured DNNs—architectures whose input/output dimensions and number of hidden neurons grow sublinearly with the sample size—when learning features of hierarchically compositional target functions. Importantly, the consistency still holds even in the conventional ``over-parameterized'' regime where the total number of parameters exceeds the number of training samples. Empirically, sublinearly structured DNNs match or surpass wide DNNs in prediction. A structural audit further indicates that widely used convolutional neural networks (CNNs), including AlexNet, VGGNet, ResNet, GoogLeNet, are sublinearly structured on their image classification benchmarks. We further prove that the sublinearly structured DNNs achieve universal approximation for hierarchically compositional functions in the large-sample limit. Moreover, images exhibit an inherent hierarchical, compositional structure. Taken together, these results explain, through a statistical lens, why many large-scale deep learning models succeed after adequate training on massive image datasets.
Submission Type: Special issue on Statistics and AI
Submission Number: 2
Loading