Global CFR: Meta-Learning in Self-Play Regret Minimization

Published: 26 Oct 2023, Last Modified: 13 Dec 2023, NeurIPS 2023 Workshop Poster
Keywords: regret minimization, meta-learning
TL;DR: We use meta-learning to speed up self-play regret minimization.
Abstract: In real-world situations, players often encounter a distribution of similar but distinct games, such as poker games with different public cards or trading of correlated assets on the stock market. Although these games have related equilibria, the current literature focuses mainly on single games or their repeated versions. Sychrovsky et al. (2023) recently introduced offline meta-learning to accelerate equilibrium finding over such distributions in a single-player online setting. We build on this work and extend it to a two-player zero-sum self-play setting. When selecting the next strategy, our method integrates information for both players across all decision states, enabling global communication in place of the traditional local regret decomposition. Evaluations on distributions of matrix and sequential games show that our meta-learned algorithms outperform their non-meta-learned counterparts.
Submission Number: 45
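
For readers unfamiliar with the setting, the following is a minimal sketch of plain (non-meta-learned) self-play regret minimization on a single zero-sum matrix game, the kind of baseline the abstract compares against. It is not the authors' method: it uses standard regret matching, and the function names (`regret_matching`, `self_play`, `exploitability`) and the example payoff matrix are assumptions chosen for illustration only.

```python
def regret_matching(cum_regret):
    """Map cumulative regrets to a strategy proportional to their positive parts."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret: play uniformly
    return [p / total for p in positive]


def self_play(payoff, iterations=10000):
    """Run regret matching for both players of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives its negative.
    Returns the average strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching(regret_row)
        y = regret_matching(regret_col)
        # Expected payoff of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        # Expected payoff of the current mixed strategies.
        v_row = sum(x[i] * u_row[i] for i in range(n_rows))
        v_col = sum(y[j] * u_col[j] for j in range(n_cols))
        # Accumulate instantaneous regrets and strategy sums.
        for i in range(n_rows):
            regret_row[i] += u_row[i] - v_row
            sum_row[i] += x[i]
        for j in range(n_cols):
            regret_col[j] += u_col[j] - v_col
            sum_col[j] += y[j]
    avg_row = [s / iterations for s in sum_row]
    avg_col = [s / iterations for s in sum_col]
    return avg_row, avg_col


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains against the profile (x, y)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    br_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    br_col = max(-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return br_row + br_col


# Hypothetical 2x2 zero-sum game used only for illustration.
example_payoff = [[2.0, -1.0], [-1.0, 1.0]]
avg_x, avg_y = self_play(example_payoff, iterations=10000)
gap = exploitability(example_payoff, avg_x, avg_y)
```

Here each player updates from its own regrets only, so the two updates are independent. The abstract describes replacing this kind of local update with a meta-learned one that sees information for both players across all decision states; that component is not reproduced in this sketch.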