Abstract: In this study we address the conundrum of the success of over-parameterized models by examining the complex relationship between parameter space and output space.
We classify key parameter sets related to generalization and training in parametric basis expansion machine learning models. Methods ranging from linear regression and extreme learning machines to neural networks fall into this category. We also classify these parametric models into identifiable and non-identifiable models according to the mapping from parameter space to function space. Such a classification of models is already present in the literature, but it is usually studied in Bayesian ML and statistics. We focus on identifiable models in this article.
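To illustrate the distinction, here is a minimal sketch (the architecture, sizes, and permutation are illustrative choices of ours, not taken from the paper): a one-hidden-layer network is non-identifiable because permuting its hidden units changes the parameters but not the function, so the map from parameter space to function space is not injective.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))

# A one-hidden-layer tanh network with 3 hidden units (illustrative sizes).
W1, b1, w2 = rng.normal(size=(1, 3)), rng.normal(size=3), rng.normal(size=3)

def net(x, W1, b1, w2):
    return np.tanh(x @ W1 + b1) @ w2

# Relabel the hidden units: a different point in parameter space...
perm = [2, 0, 1]
same = np.allclose(net(x, W1, b1, w2),
                   net(x, W1[:, perm], b1[perm], w2[perm]))
print(same)  # True: ...but the very same function, hence non-identifiable
```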
We later classify generalization into strict and weak generalization according to learning in parameter space for fixed basis regression models, which fall into the category of identifiable models. Strict generalization occurs when the true parameters of the ground truth (or their unidentifiable counterparts) are learned, while weak generalization occurs when they are not learned but local generalization is still achieved. We present the conditions needed for strict generalization in fixed basis regression settings trained using pseudo-inverse methods. We show that strict generalization cannot be achieved in over-parameterized regimes trained through the pseudo-inverse method, and that approaching strict generalization using gradient descent can depend entirely on initialization and randomness. This supports the classical idea that over-parameterization is harmful, while emphasizing that it applies to the strict generalization case. However, weak generalization can always be achieved in over-parameterized regimes under certain conditions. We thus study the complex relationship between generalization in output space and the parameter space to understand the conundrum of the success of over-parameterized models, and we try to weave a coherent and consistent picture of the same.
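A minimal numerical sketch of the pseudo-inverse claim follows (the polynomial basis and the problem sizes are our own illustrative assumptions, not those fixed in the paper): with more basis functions than training samples, the minimum-norm pseudo-inverse solution interpolates the training labels yet fails to recover the ground-truth parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed basis regression f(x) = sum_j theta_j * phi_j(x) with a
# polynomial basis phi_j(x) = x**j (an illustrative choice of basis).
def design(x, m):
    return np.vander(x, m, increasing=True)  # n x m design matrix

theta_true = np.array([1.0, -2.0, 0.5])       # ground truth uses 3 basis functions
x_train = rng.uniform(-1.0, 1.0, size=5)
y_train = design(x_train, 3) @ theta_true     # noiseless training labels

# Over-parameterized regime: 10 basis functions, only 5 samples.
Phi = design(x_train, 10)
theta_hat = np.linalg.pinv(Phi) @ y_train     # minimum-norm pseudo-inverse fit

print(np.allclose(Phi @ theta_hat, y_train))   # True: training data interpolated
print(np.allclose(theta_hat[:3], theta_true))  # False: true parameters not recovered
```

Any interpolating parameter vector fits the training data equally well, so the pseudo-inverse returns the minimum-norm one, which generically differs from the zero-padded ground truth.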
Later, we study generalization performance under label noise for the distinct scenarios identified in this article. We include insights into the theory of deep learning and quantum machine learning.
Our work serves as a refinement of the idea of generalization and provides insights through proofs and, in some cases, demonstrations. Our focus in this article is purely taxonomical and conceptual rather than driven by the introduction of new metrics.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Version 1.
Removed a GitHub link for anonymity, in order to comply with the double-blind review policy.
No other changes to the manuscript content.
Version 2.
Improved the introduction and conclusion and reduced redundancy in the presentation. Refined the related works section.
Introduced the narrative of identifiable and non-identifiable models. Identifiable models are models where there is an injective mapping between parameter space and function space, an important distinction left out in the previous version. Improved the definition of strict generalization to remove any ambiguity between identifiable and non-identifiable models. Included Appendix A to discuss the identifiability of the fixed basis regression model.
The FBR models are identifiable models and neural networks are non-identifiable models. We retain the ideas of strict generalization and weak generalization but rephrase the conditions of Theorem 3.1 to apply only to FBR models trained using the pseudo-inverse.
Improved the statement of claims to avoid any misinterpretation.
Added a section on implicit regularization and the classification of parameter sets in the model parameter space, namely Section 6. Fig. 12 is an additional figure. Included a discussion of the convexity and non-convexity questions in this section.
Assigned Action Editor: ~Russell_Tsuchida1
Submission Number: 5790