Keywords: in-context learning, autoregressive processes, generalization, non-i.i.d. learning theory
Abstract: In this paper, we derive generalization results for next-token risk minimization in autoregressive processes of unbounded order. Our starting point is to relate the empirical loss to the denoising loss, a step that requires no assumptions beyond those needed for fixed-order Markovian models. We then show that, under a mixing or rephrasability condition on the data-generating process and a stability assumption on the hypothesis class, the out-of-sample generalization error concentrates around the denoising error. These results characterize sample complexity in terms of the number of tokens rather than the number of i.i.d. sequences. As a primary application, we interpret in-context learning as a special case of autoregressive prediction and derive sample complexity bounds under similar conditions. Importantly, the generalization rates are determined by the properties of the individual in-context tasks, without requiring assumptions on the mixture process. This perspective suggests that in-context learning can exploit the task decomposition to learn efficiently.
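A minimal formalization of the objects mentioned in the abstract, under assumed notation not fixed by the paper: writing $h$ for a hypothesis mapping a prefix $x_{1:t-1}$ to a predictive distribution over the next token, and $\ell$ for a next-token loss (e.g., the log-loss), the empirical next-token risk over a single length-$T$ sequence and its out-of-sample counterpart could be taken as

\[
\widehat{R}_T(h) \;=\; \frac{1}{T}\sum_{t=1}^{T} \ell\bigl(h(x_{1:t-1}),\, x_t\bigr),
\qquad
R_T(h) \;=\; \mathbb{E}\bigl[\widehat{R}_T(h)\bigr],
\]

so that "sample complexity in terms of the number of tokens" refers to controlling $\lvert R_T(h) - \widehat{R}_T(h)\rvert$ as a function of $T$ rather than of the number of independent sequences.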
Submission Number: 40