Stability and Sharper Risk Bounds with Convergence Rate $\tilde{O}(1/n^2)$

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 | NeurIPS 2025 poster | CC BY 4.0
Keywords: algorithmic stability, generalization bounds, excess risk bounds, stochastic gradient descent
TL;DR: We derive high-probability excess risk bounds of order at most $\tilde{O}(1/n^2)$ for ERM, GD, and SGD; our high-probability bounds on the generalization error of gradients for nonconvex problems are also the sharpest available.
Abstract: Prior work (Klochkov \& Zhivotovskiy, 2021) establishes excess risk bounds of at most $O\left(\log(n)/n\right)$ via algorithmic stability for strongly convex learners, with high probability. We show that under similar common assumptions, namely the Polyak-Lojasiewicz condition, smoothness, and Lipschitz continuity of the losses, rates of at most $O\left(\log^2(n)/n^2\right)$ are achievable. To our knowledge, our analysis also provides the tightest high-probability bounds on gradient-based generalization gaps in nonconvex settings.
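Schematically, and using illustrative notation not fixed by the abstract itself (with $R(w)=\mathbb{E}_{z}\left[\ell(w;z)\right]$ the population risk and $\hat{w}_n$ the output of ERM, GD, or SGD trained on $n$ samples; constants and the dependence on the confidence level are suppressed), the improvement is from the prior strongly-convex guarantee
$$R(\hat{w}_n) - \min_{w} R(w) \le O\!\left(\frac{\log(n)}{n}\right)$$
to a bound of the form
$$R(\hat{w}_n) - \min_{w} R(w) \le O\!\left(\frac{\log^2(n)}{n^2}\right)$$
under the Polyak-Lojasiewicz condition, smoothness, and Lipschitz continuity, each holding with high probability.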
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 23199