Abstract: We close open theoretical gaps in Multi-Agent Imitation Learning (MAIL) by characterizing the limits of non-interactive MAIL and presenting the first interactive algorithm with near-optimal sample complexity.
In the non-interactive setting, we prove a statistical lower bound that identifies the \emph{all-policy deviation concentrability coefficient} as the fundamental complexity measure, and we show that Behavior Cloning (BC) is rate-optimal. For the interactive setting, we introduce a framework that combines reward-free reinforcement learning with interactive MAIL and instantiate it with an algorithm, \emph{\ours}, which improves the best previously known sample complexity from $\mathcal{O}(\varepsilon^{-8})$ to $\mathcal{O}(\varepsilon^{-2})$, matching the dependence on $\varepsilon$ implied by our lower bound. Finally, we provide numerical results that support our theory and illustrate, in environments such as grid worlds, cases where Behavior Cloning fails to learn.
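As a minimal illustrative sketch (not the paper's experiment, and separate from the linked repository), the toy grid world below shows the kind of Behavior Cloning failure alluded to in the abstract: a cloned policy that deviates from the expert with a small per-step slip probability drifts into states the expert never demonstrated and cannot recover, so its return gap grows with the horizon. The grid size, horizon, slip probability, and off-distribution behavior are all assumed for illustration.

```python
# Hypothetical sketch of compounding errors under Behavior Cloning in a toy
# grid world. All environment details are illustrative assumptions, not the
# paper's benchmark.
import random

random.seed(0)

SIZE, HORIZON = 5, 20          # assumed 5x5 grid and episode length 20
GOAL = (0, SIZE - 1)           # expert walks right along the top row

def expert_action(state):
    # Expert policy: always move right toward the goal on the top row.
    return "right"

def step(state, action):
    r, c = state
    if action == "right":
        c = min(c + 1, SIZE - 1)
    elif action == "down":
        r = min(r + 1, SIZE - 1)
    return (r, c)

def rollout(policy):
    state, total = (0, 0), 0.0
    for _ in range(HORIZON):
        state = step(state, policy(state))
        total += 1.0 if state == GOAL else 0.0
    return total

def bc_policy(eps):
    # Cloned policy: matches the expert except for an eps-probability slip.
    # Off the expert's row it has no demonstrations, so it acts arbitrarily
    # (here: it keeps moving down) and never returns to the goal row.
    def policy(state):
        r, _ = state
        if r > 0:
            return "down"
        return "down" if random.random() < eps else expert_action(state)
    return policy

expert_return = rollout(expert_action)
bc_return = sum(rollout(bc_policy(eps=0.1)) for _ in range(1000)) / 1000
print(f"expert return: {expert_return:.2f}, BC return (eps=0.1): {bc_return:.2f}")
```

Under these assumptions the expert collects reward on almost every step after reaching the goal, while the cloned policy's average return drops sharply, since a single slip early in the episode is never corrected.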
Code Dataset Promise: Yes
Code Dataset Url: https://github.com/tfreihaut/MAIL_WARM.git
Signed Copyright Form: pdf
Format Confirmation: I agree that I have read and followed the formatting instructions for the camera ready version.
Submission Number: 1486