Keywords: m-out-of-n bootstrap; sample quantiles; central limit theorem; Edgeworth expansion; Berry–Esseen bounds; L-statistics; robust inference
TL;DR: We establish the first parameter-free CLT for m-out-of-n bootstrap quantiles, sharpen it with finite-sample Edgeworth and Berry–Esseen bounds, and derive parameter-free asymptotic distributions for modern learning tasks.
Abstract: The m-out-of-n bootstrap—proposed by \cite{bickel1992resampling}—approximates the distribution of a statistic by repeatedly drawing subsamples of size $m$ ($m \ll n$) without replacement from an original sample of size $n$; it is now routinely used for robust inference with heavy-tailed data, bandwidth selection, and other large-sample applications. Despite this broad applicability across econometrics, biostatistics, and machine-learning workflows, rigorous parameter-free guarantees for the soundness of the m-out-of-n bootstrap when estimating sample quantiles have remained elusive.
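To make the resampling scheme concrete, here is a minimal sketch of the m-out-of-n bootstrap for a sample quantile. The function name, the choice $m = \sqrt{n}$, and all parameter values are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of the m-out-of-n bootstrap for a sample quantile.
# All names and parameter values here are illustrative.
import numpy as np

def m_out_of_n_quantile(x, q, m, n_boot=2000, rng=None):
    """Bootstrap distribution of the q-th sample quantile, built from
    subsamples of size m drawn without replacement from x."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    # Each replicate: a size-m subsample (without replacement), then its q-quantile.
    return np.array([
        np.quantile(rng.choice(x, size=m, replace=False), q)
        for _ in range(n_boot)
    ])

# Example: heavy-tailed data, the median, and m = n**0.5 as a common subsample size.
n = 10_000
data = np.random.default_rng(0).standard_t(df=2, size=n)
reps = m_out_of_n_quantile(data, q=0.5, m=int(n**0.5), rng=1)
print(reps.mean(), reps.std())
```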
This paper establishes such guarantees by analysing the estimator of sample quantiles obtained from m-out-of-n resampling of a dataset of size $n$. We first prove a central limit theorem for a fully data-driven version of the estimator that holds under a mild moment condition and involves no unknown nuisance parameters. We then show that the moment assumption is essentially tight by constructing a counterexample in which the CLT fails. Strengthening the assumptions slightly, we derive an Edgeworth expansion that delivers exact convergence rates and, as a corollary, a Berry–Esseen bound on the bootstrap approximation error. Finally, we illustrate the scope of our results by obtaining parameter-free asymptotic distributions for practical statistics, including quantiles of random-walk Metropolis–Hastings chains and rewards of ergodic MDPs, thereby demonstrating the usefulness of our theory in modern estimation and learning tasks.
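As a usage illustration, the bootstrap replicates can be turned into a confidence interval for a population quantile. This is a generic sketch based on the standard m-out-of-n rescaling, under which $\sqrt{m}(\hat\theta^*_m - \hat\theta_n)$ approximates the law of $\sqrt{n}(\hat\theta_n - \theta)$; it is not the paper's data-driven studentization, and all constants below are hypothetical:

```python
# Hedged sketch: a confidence interval for the median from m-out-of-n
# bootstrap replicates, using the standard sqrt(m/n) rescaling.
# Not the paper's construction; all parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n, m, n_boot, alpha = 10_000, 100, 2_000, 0.05
data = rng.standard_t(df=2, size=n)   # heavy-tailed sample, true median 0
theta_n = np.median(data)             # full-sample estimate

# m-out-of-n bootstrap replicates of the median.
reps = np.array([
    np.median(rng.choice(data, size=m, replace=False))
    for _ in range(n_boot)
])

# Rescale deviations: sqrt(m)(theta*_m - theta_n) mimics sqrt(n)(theta_n - theta),
# so the deviation of theta_n from theta is approximated by sqrt(m/n) * (reps - theta_n).
dev = np.sqrt(m / n) * (reps - theta_n)
lo = theta_n - np.quantile(dev, 1 - alpha / 2)
hi = theta_n - np.quantile(dev, alpha / 2)
print(f"{1 - alpha:.0%} CI for the median: [{lo:.4f}, {hi:.4f}]")
```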
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 17607