TL;DR: This paper presents Preference-CFR (Pref-CFR), a novel algorithm that enables convergence to diverse equilibria in imperfect-information games.
Abstract: Artificial intelligence (AI) has surpassed top human players in a variety of games. In imperfect-information games, these achievements have been driven primarily by Counterfactual Regret Minimization (CFR) and its variants for computing Nash equilibria. However, most existing research focuses on maximizing payoff while largely neglecting strategic diversity and the need for varied play styles, which limits AI's adaptability to different user preferences.
To address this gap, we propose Preference-CFR (Pref-CFR), a novel method that incorporates two key parameters: preference degree and vulnerability degree. These parameters enable the AI to adjust its strategic distribution within an acceptable performance loss threshold, thereby enhancing its adaptability to a wider range of strategic demands. In our experiments with Texas Hold’em, Pref-CFR successfully trained Aggressive and Loose Passive styles that not only match original CFR-based strategies in performance but also display clearly distinct behavioral patterns. Notably, for certain hand scenarios, Pref-CFR produces strategies that diverge significantly from both conventional expert heuristics and original CFR outputs, potentially offering novel insights for professional players.
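The abstract does not spell out the update rule, but to give intuition for how the two parameters might act, the following is a minimal, hypothetical Python sketch of a preference-weighted regret-matching step at a single information set. The function name `pref_regret_matching`, the per-action preference vector `pref`, and the linear blend controlled by `vul` are illustrative assumptions, not the paper's actual Pref-CFR update.

```python
# Illustrative sketch only: the weighting scheme below is an assumption,
# not the Pref-CFR update rule from the paper.
import numpy as np

def pref_regret_matching(regrets: np.ndarray,
                         pref: np.ndarray,
                         vul: float) -> np.ndarray:
    """Map cumulative regrets to a strategy, biased toward preferred actions.

    regrets: cumulative counterfactual regrets, one entry per action.
    pref:    nonnegative preference degrees, one entry per action.
    vul:     vulnerability degree in [0, 1]; 0 recovers plain regret
             matching, larger values push the strategy toward the
             preference profile at the cost of exploitability.
    """
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        base = positive / total  # standard regret-matching strategy
    else:
        base = np.full_like(regrets, 1.0 / len(regrets))  # uniform fallback
    bias = pref / pref.sum()               # normalized preference profile
    return (1.0 - vul) * base + vul * bias  # blend payoff-driven and styled play

# Example: regrets favor action 0, the preference profile favors action 2.
strategy = pref_regret_matching(np.array([5.0, 1.0, 0.0]),
                                np.array([1.0, 1.0, 4.0]),
                                vul=0.2)
print(strategy)  # shifted slightly toward action 2 relative to plain CFR
```

With `vul = 0` this reduces to ordinary regret matching, so the sketch makes the trade-off described in the abstract explicit: a larger vulnerability degree buys more stylistic bias within the accepted performance loss.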
Lay Summary: A central challenge in game theory is solving for equilibria, a problem that has been extensively studied. Researchers have achieved remarkable success in solving complex games such as Go, Texas Hold'em, and StarCraft. However, existing work primarily focuses on finding a single equilibrium strategy, overlooking the need to generate diverse playing styles that align with human preferences.
Building upon the Counterfactual Regret Minimization (CFR) framework, we introduce a novel variant, Preference-CFR (Pref-CFR), which incorporates two additional parameters: preference degree and vulnerability degree. This approach generates AI strategies that balance stylistic diversity against an acceptable utility loss, thereby meeting user-specified playstyle requirements.
Link To Code: https://github.com/Zealoter/PrefCFR
Primary Area: Theory->Game Theory
Keywords: Nash Equilibrium, CFVFP, customized AI, Game
Submission Number: 1033