Single Deep Counterfactual Regret Minimization

Eric Steinberger

25 Sept 2019 (modified: 23 Mar 2025) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

Abstract: Counterfactual Regret Minimization (CFR) is the most successful algorithm for finding approximate Nash equilibria in imperfect information games. However, CFR's reliance on full game-tree traversals limits its scalability and generality. Therefore, the game's state and action spaces are often abstracted (i.e., simplified) for CFR, and the resulting strategy is then mapped back to the full game. This requires extensive expert knowledge, is not practical in many games outside of poker, and often converges to highly exploitable policies. A recently proposed method, Deep CFR, applies deep learning directly to CFR, allowing the agent to intrinsically abstract and generalize over the state space from samples, without requiring expert knowledge. In this paper, we introduce Single Deep CFR (SD-CFR), a variant of Deep CFR that has a lower overall approximation error by avoiding the training of an average strategy network. We show that SD-CFR is more attractive from a theoretical perspective and empirically outperforms Deep CFR with respect to exploitability and one-on-one play in poker.
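
For readers unfamiliar with CFR, the per-decision rule at its core is regret matching: each action is played with probability proportional to its positive cumulative regret. The following is a minimal Python sketch of that standard rule, not code from this submission; the function name `regret_matching` and the example values are illustrative assumptions, and the uniform fallback is the textbook choice rather than necessarily the one used by Deep CFR or SD-CFR.

```python
def regret_matching(cumulative_regrets):
    """Map cumulative counterfactual regrets to action probabilities.

    Hypothetical helper for illustration; not part of the submission's code.
    """
    # Only actions with positive regret receive probability mass.
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


if __name__ == "__main__":
    # Example regrets for three actions (made-up values).
    print(regret_matching([1.0, 3.0, -2.0]))
```

Deep CFR and SD-CFR, as described in the abstract, replace the tabular regret values fed to a rule like this with the outputs of neural networks trained from samples.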

Code: https://drive.google.com/file/d/18Vu07ewvaZyPBVyOwsKbTu1gH0R8Zlq0/view?usp=sharing

Keywords: Game Theory, Deep Reinforcement Learning, Counterfactual Regret Minimization, Imperfect Information Games, Games, Poker, Nash Equilibrium

TL;DR: Better Deep Reinforcement Learning algorithm to approximate Counterfactual Regret Minimization

Community Implementations: [5 code implementations](https://www.catalyzex.com/paper/single-deep-counterfactual-regret/code)

