Autoregressive Learning under Joint KL Analysis: Horizon-Free Approximation and Computational-Statistical Tradeoffs

Published: 28 Nov 2025, Last Modified: 30 Nov 2025
NeurIPS 2025 Workshop MLxOR
License: CC BY 4.0
Keywords: autoregressive modeling, generalization bound, improper learning, computational-statistical tradeoff
Abstract: We study autoregressive generative modeling under misspecification measured by the joint Kullback–Leibler (KL) divergence. For approximation, we show that the barrier under joint KL is horizon-free, i.e., independent of the sequence length $H$, unlike prior Hellinger-based analyses that imply an $\Omega(H)$ dependence. For estimation, we prove a finite-sample lower bound showing that any proper learner, including empirical risk minimization, suffers $\Omega(H^2)$ error. We then propose an improper Bayesian posterior learner that leverages a lifted policy space for computational efficiency, achieving horizon-free approximation and an $O(H)$ estimation rate. Our results identify the choice of divergence as the source of horizon dependence in approximation and establish a genuine computational-statistical tradeoff for estimation, motivating new algorithmic designs.
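For context, a minimal sketch of the joint-KL quantity the abstract refers to, written for autoregressive models over a horizon $H$. This is the standard chain rule for KL divergence under the autoregressive factorization, not a result or notation taken from the paper itself; $p$ denotes the data distribution and $q$ the model.

```latex
% Autoregressive factorization over horizon H and the KL chain rule.
% p and q are joint distributions over sequences x_{1:H}, each factoring
% into per-step conditionals given the prefix x_{<h}.
\begin{align*}
  p(x_{1:H}) &= \prod_{h=1}^{H} p(x_h \mid x_{<h}),
  \qquad
  q(x_{1:H}) = \prod_{h=1}^{H} q(x_h \mid x_{<h}), \\
  \mathrm{KL}\bigl(p \,\|\, q\bigr)
  &= \sum_{h=1}^{H} \mathbb{E}_{x_{<h} \sim p}
     \Bigl[\mathrm{KL}\bigl(p(\cdot \mid x_{<h}) \,\|\, q(\cdot \mid x_{<h})\bigr)\Bigr].
\end{align*}
```

The chain rule expresses the joint KL as a sum of expected per-step conditional KLs, which is the sense in which divergence choice governs how errors accumulate with the horizon $H$.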
Submission Number: 213