Kernel Regression in Structured Non-IID Settings: Theory and Implications for Denoising Score Learning

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: kernel ridge regression, generalization, sample complexity, diffusion model
TL;DR: We provide a refined analysis of the excess risk of kernel ridge regression in structured non-i.i.d. settings and apply it to denoising score learning.
Abstract: Kernel ridge regression (KRR) is a foundational tool in machine learning, with recent work emphasizing its connections to neural networks. However, existing theory primarily addresses the i.i.d. setting, while real-world data often exhibit structured dependencies, particularly in applications like denoising score learning, where multiple noisy observations derive from shared underlying signals. We present the first systematic study of KRR generalization for non-i.i.d. data with signal-noise causal structure, where observations represent different noisy views of common signals. Under standard spectral decay assumptions, we develop a novel blockwise decomposition method that enables precise concentration analysis for dependent data. Using this technique, we derive excess risk bounds for KRR that explicitly depend on: (1) the kernel spectrum, (2) causal structure parameters, and (3) the sampling mechanism (including the relative sample sizes for signals and noises). We further apply our results to denoising score learning, establishing generalization guarantees and providing principled guidance for sampling noisy data points. This work advances KRR theory while providing practical tools for analyzing dependent data in modern machine learning applications.
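To make the setting concrete, below is a minimal sketch (not the authors' code) of KRR fit on data with the signal-noise structure the abstract describes: m underlying signals each produce k noisy views, so observations that share a signal are dependent rather than i.i.d. The RBF kernel choice and all names and values (m, k, sigma, lam, ell) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: kernel ridge regression on structured non-i.i.d. data.
# Each of m underlying signals yields k noisy views, so the n = m * k
# observations are dependent within blocks sharing a signal.

rng = np.random.default_rng(0)
m, k, sigma, lam = 50, 5, 0.1, 1e-3   # signals, views per signal, noise scale, ridge

signals = rng.uniform(-1, 1, size=m)                                   # shared underlying signals
X = (signals[:, None] + sigma * rng.standard_normal((m, k))).ravel()   # noisy views of each signal
y = np.sin(2 * np.pi * np.repeat(signals, k))                          # targets depend on the clean signal

def rbf(a, b, ell=0.2):
    """RBF kernel matrix between 1-D input arrays a and b."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

n = len(X)
K = rbf(X, X)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)  # KRR dual coefficients

def predict(x_new):
    """Evaluate the fitted KRR predictor at new points."""
    return rbf(np.asarray(x_new, dtype=float), X) @ alpha

print(predict([0.0, 0.5]))
```

Under a fixed budget n = m * k, varying m against k is precisely the signals-versus-noises sampling trade-off that the paper's excess risk bounds are meant to quantify.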
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 16429