Counterfactual Regret Minimization for Sequential Equilibrium

ICLR 2026 Conference Submission17332 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | CC BY 4.0
Keywords: Computer Games, sequential equilibrium, CFR, algorithmic game theory
Abstract: Computing a Nash equilibrium (NE) in imperfect-information two-player zero-sum sequential games is an important problem. Refinements of the Nash equilibrium matter because a Nash equilibrium strategy may take sub-optimal actions in states that cannot be reached in equilibrium. The extensive-form perfect equilibrium (EFPE) and the sequential equilibrium (SE) are two such refinements; they address this shortcoming by assuming that players make mistakes. Most current algorithms for computing SE and EFPE are not iterative and need to solve linear programs, which makes them ineffective on large-scale games. In this work, we improve the framework of the counterfactual regret minimization (CFR) algorithm and prove that our algorithm converges to refinements of the Nash equilibrium under some assumptions. Our method applies a local perturbation in every state of the game and uses a suitable perturbation descent method. We compare our Sequential Perturbed Counterfactual Regret Minimization (SPCFR) algorithm with CFR variants and with perturbed CFR, an algorithm for approximate EFPE computation. Experimental results show that our method outperforms existing CFR-based methods on popular games, including Kuhn Poker, Leduc Hold'em, and GoofSpiel.
Primary Area: reinforcement learning
Submission Number: 17332
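
The abstract's idea of a local perturbation at each state can be illustrated with a minimal sketch. It assumes the perturbation is a probability floor `eps` placed on every action of a regret-matching strategy, as is common in perturbed-game approaches to equilibrium refinement. This is not the authors' SPCFR algorithm: the function names and the `eps0 / (t + 1)` decay schedule are hypothetical and chosen only for illustration.

```python
from typing import List


def regret_matching(regrets: List[float]) -> List[float]:
    """Standard regret matching: play in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positive]


def perturbed_strategy(regrets: List[float], eps: float) -> List[float]:
    """Mix regret matching with a floor so every action has probability >= eps.

    Assumption: the local perturbation is modeled as a uniform floor `eps`
    on each action at one information set.
    """
    n = len(regrets)
    if eps < 0.0 or eps * n > 1.0:
        raise ValueError("eps must satisfy 0 <= eps * n <= 1")
    base = regret_matching(regrets)
    # Reserve eps for every action, then spread the remaining mass via `base`.
    return [eps + (1.0 - n * eps) * p for p in base]


def epsilon_schedule(t: int, eps0: float = 0.1) -> float:
    """Hypothetical decay of the perturbation over iterations t = 0, 1, 2, ..."""
    return eps0 / (t + 1)


if __name__ == "__main__":
    cumulative_regrets = [2.0, -1.0, 0.0]
    for t in (0, 9, 99):
        eps = epsilon_schedule(t)
        print(t, perturbed_strategy(cumulative_regrets, eps))
```

As `eps` shrinks, the strategy approaches plain regret matching while still assigning positive probability to every action, so every information set stays reachable throughout the iterations.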