Low-Rank Tensorization Improves KAN Sample Complexity

07 May 2026 (modified: 09 May 2026) · ICML 2026 Workshop CoLoRAI Submission · CC BY 4.0
Keywords: Kolmogorov-Arnold Networks (KANs), Low-rank tensorization (CP), Sample complexity, PAC generalization bounds, Rademacher complexity
TL;DR: We prove that low-rank CP tensorization of KAN spline coefficients tightens PAC generalization bounds, improving sample efficiency and reducing grid-induced overfitting under mild assumptions.
Abstract: Kolmogorov-Arnold Networks (KANs) replace scalar edge weights with learnable B-spline functions and thereby enrich the local approximation mechanism, but this flexibility creates a parameter bottleneck whose leading term scales linearly in the grid size $G$ for every input-output edge. Standard capacity estimates for unconstrained KAN layers consequently give sample-complexity terms proportional to $d_{\mathrm{in}}d_{\mathrm{out}}G$, making high-dimensional KANs statistically vulnerable unless the sample size grows with the size of the full edge-grid tensor. This paper introduces Tensor-Decomposed Low-Rank KANs (LR-KANs), in which the three-way B-spline coefficient tensor $\mathcal{W}\in\mathbb{R}^{d_{\mathrm{in}}\times d_{\mathrm{out}}\times G}$ is constrained to a rank-$r$ CANDECOMP/PARAFAC (CP) form, i.e., $\mathcal{W}_{ijk}=\sum_{l=1}^{r}A_{il}B_{jl}C_{kl}$ for factor matrices $A$, $B$, $C$. Under explicit bounded-factor and spline-regularity assumptions, we prove PAC generalization bounds by controlling Lipschitz constants, covering the CP parameter manifold, and applying Dudley's entropy integral. The dominant sample-complexity dimension thereby decreases from $d_{\mathrm{in}}d_{\mathrm{out}}G$ to $r(d_{\mathrm{in}}+d_{\mathrm{out}}+G)$, giving a formal statistical explanation for tensorized KAN layers in regimes where grid refinement would otherwise overfit.
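As a rough illustration (not the authors' implementation), the CP constraint on the coefficient tensor can be sketched in PyTorch as below; the class name and the factor matrices `A`, `B`, `C` are hypothetical names for the three CP modes, and the initialization scale is an arbitrary choice for the sketch.

```python
import torch
import torch.nn as nn

class CPSplineCoefficients(nn.Module):
    """Minimal sketch of a rank-r CP parameterization of the KAN
    B-spline coefficient tensor W in R^{d_in x d_out x G}.
    Stores r * (d_in + d_out + G) parameters instead of the
    dense d_in * d_out * G."""

    def __init__(self, d_in: int, d_out: int, G: int, rank: int):
        super().__init__()
        # One factor matrix per tensor mode (names are illustrative).
        self.A = nn.Parameter(torch.randn(d_in, rank) / rank**0.5)
        self.B = nn.Parameter(torch.randn(d_out, rank) / rank**0.5)
        self.C = nn.Parameter(torch.randn(G, rank) / rank**0.5)

    def forward(self) -> torch.Tensor:
        # CP reconstruction: W[i, j, k] = sum_l A[i, l] * B[j, l] * C[k, l]
        return torch.einsum('il,jl,kl->ijk', self.A, self.B, self.C)

# Example: d_in = d_out = 64, G = 16, rank r = 8 gives
# 8 * (64 + 64 + 16) = 1152 parameters versus 64 * 64 * 16 = 65536 dense.
coeffs = CPSplineCoefficients(d_in=64, d_out=64, G=16, rank=8)
W = coeffs()  # shape (64, 64, 16), fed to the B-spline basis expansion
```

The reconstructed tensor would then weight the B-spline basis evaluations on each edge exactly as in a standard KAN layer; only the parameterization of $\mathcal{W}$ changes, which is what drives the bound from $d_{\mathrm{in}}d_{\mathrm{out}}G$ down to $r(d_{\mathrm{in}}+d_{\mathrm{out}}+G)$.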
Submission Number: 62