The Equivalence of Dynamic and Strategic Stability under Regularized Learning in Games

Published: 21 Sept 2023 · Last Modified: 02 Nov 2023 · NeurIPS 2023 spotlight
Keywords: Regularized learning, dynamic stability, strategic stability, Nash equilibrium
Abstract: In this paper, we examine the long-run behavior of regularized, no-regret learning in finite N-player games. A well-known result in the field states that the empirical frequencies of play under no-regret learning converge to the game’s set of coarse correlated equilibria; however, our understanding of how the players' _actual strategies_ evolve over time is much more limited – and, in many cases, non-existent. This issue is exacerbated further by a series of recent results showing that _only_ strict Nash equilibria are stable and attracting under regularized learning, thus making the relation between learning and _pointwise_ solution concepts particularly elusive. In lieu of this, we take a more general approach and instead seek to characterize the _setwise_ rationality properties of the players' day-to-day trajectory of play. To do so, we focus on one of the most stringent criteria of setwise strategic stability, namely that any unilateral deviation from the set in question incurs a cost to the deviator – a property known as _closedness under better replies_ (club). In so doing, we obtain a remarkable equivalence between strategic and dynamic stability: _a product of pure strategies is closed under better replies if and only if its span is stable and attracting under regularized learning._ In addition, we estimate the rate of convergence to such sets, and we show that methods based on entropic regularization (like the exponential weights algorithm) converge at a geometric rate, while projection-based methods converge within a finite number of iterations, even with bandit, payoff-based feedback.
Supplementary Material: pdf
Submission Number: 12442