TL;DR: We provide generalisation error bound under heavy-tailed assumptions.
Abstract: The generalization error of a supervised statistical learning algorithm, defined as the difference between the population risk and the empirical risk, quantifies its ability to predict performance on previously unseen data. In this work, we analyze the generalization error under the heavy-tailed assumption on the loss function with respect to the data-generating distribution. Specifically, we derive uniform, information-theoretic, and PAC-Bayesian bounds on the generalization error under the assumption that the $(1+\epsilon)$-th moment of the loss function is bounded for $\epsilon\in(0,1]$. The generalization error is shown to have a convergence rate of $O(n^{-\epsilon/(1+\epsilon)})$ where $n$ is the number of training samples. Furthermore, we apply our results to study the generalization error of the Gibbs posterior and noisy iterative learning algorithms under the heavy-tailed assumption.
Code Dataset Promise: Yes
Signed Copyright Form: pdf
Format Confirmation: I agree that I have read and followed the formatting instructions for the camera ready version.
Submission Number: 2278
Loading