Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation

Pedro Cisneros-Velarde; Oluwasanmi O Koyejo

Published: 08 May 2023, Last Modified: 26 Jun 2023 | UAI 2023 | Readers: Everyone

Keywords: reinforcement learning, linear function approximation, Nash, Q-learning, Markov game

Abstract: Nash Q-learning may be considered one of the first and best-known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original proof was asymptotic and restricted to the tabular case. Recently, finite-sample guarantees have been provided for the tabular case using more modern RL techniques. Our work analyzes Nash Q-learning with linear function approximation, a representation regime used when the state space is large or continuous, and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation and has a polynomial gap when compared to the best-known result for the tabular case.

Supplementary Material: pdf

Other Supplementary Material: zip
