Exact Rates and Saturation Effect of Kernel Ridge Regression over Unbounded Input Space in Large Dimensions
Keywords: reproducing kernel Hilbert space, high-dimensional statistics, kernel ridge regression, saturation effect, generalization error
Abstract: This work presents a theoretical analysis of Kernel Ridge Regression (KRR) in a large-dimensional regime where both the input dimension $d$ and sample size $n$ grow, satisfying $n \asymp d^\gamma$ for some $\gamma > 0$. We extend prior studies, which focus on inner product kernels on the sphere $\mathbb{S}^{d-1}$, to broader classes of kernels defined on unbounded domains $\mathcal{X} \subseteq \mathbb{R}^d$. Suppose that the true function $f_{\rho}^* \in [\mathcal{H}]^s$, where $[\mathcal{H}]^s$ denotes the interpolation space of the RKHS $\mathcal{H}$ with source condition $s>0$. Our primary contribution is a precise characterization of the generalization error of KRR, treating the cases $s \ge 1$ and $s < 1$ separately.
Surprisingly, after adapting the result to the Gaussian kernel in large dimensions and deriving precise asymptotics for the corresponding eigenvalues, our analysis of Gaussian kernel ridge regression reveals that it:
$i)$ achieves minimax optimality when $0 < s \leq 1$;
$ii)$ fails to attain the minimax lower bound for $s > 1$, demonstrating a $\textit{saturation effect}$ where additional smoothness beyond this point does not improve the convergence rate.
Furthermore, we identify two phenomena unique to the large-dimensional regime: a $\textit{periodic plateau phenomenon}$ in the convergence rate and a $\textit{multiple-descent behavior}$ with respect to the sample size $n$.
Primary Area: learning theory
Submission Number: 11561