Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games

Leello Tadesse Dadi; Ioannis Panageas; Stratis Skoulakis; Luca Viano; Volkan Cevher

Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games

Leello Tadesse Dadi, Ioannis Panageas, Stratis Skoulakis, Luca Viano, Volkan Cevher

Published: 18 Jun 2024, Last Modified: 16 Jul 2024Agentic Markets @ ICML'24 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Congestion games, Nash Equilibrium, No-regret, polynomial time, bandit feedback

Abstract: We present an online learning algorithm in the bandit feedback model that, once adopted by all agents of a congestion game, results in game-dynamics that converge to an $\epsilon$-approximate Nash Equilibrium in a polynomial number of rounds with respect to $1/\epsilon$, the number of players and the number of available resources. The proposed algorithm also guarantees sublinear regret to any agent adopting it. As a result, our work answers an open question from Cui et al (2022) and extends the recent results of Panageas et al (2023) to the bandit feedback model. Our algorithm can be implemented in *polynomial time* for the important special case of Network Congestion Games on Directed Acyclic Graphs (DAG) as barycentric spanners can efficiently be constructed in this case. We complete our work by further proposing a natural, exact, $1$-barycentric spanner construction for DAGs.

Submission Number: 5

Loading