DanZero+: Dominating the GuanDan Game Through Reinforcement Learning

Youpeng Zhao; Yudong Lu; Jian Zhao; Wengang Zhou; Houqiang Li

Published: 01 Jan 2024, Last Modified: 07 Mar 2025, IEEE Trans. Games 2024, CC BY-SA 4.0

Abstract: Recent advancements have enabled artificial intelligence (AI) to showcase expertise in intricate card games, such as Mahjong, DouDizhu, and Texas Hold'em. In this work, we aim to develop an AI program for an exceptionally complex and popular card game called GuanDan. This game involves four players engaging in both competitive and cooperative play throughout a long process, posing great challenges for AI due to its expansive state and action space, long episode length, and complex rules. Employing reinforcement learning techniques, specifically deep Monte Carlo, and a distributed training framework, we first put forward an AI program named DanZero. Evaluation against baseline AI programs based on heuristic rules highlights the outstanding performance of our bot. To further enhance the AI's capabilities, we apply proximal policy optimization to GuanDan on the basis of DanZero. To address the challenges arising from the huge action space, which significantly impairs the performance of policy-based algorithms, we adopt the pretrained model to compress the action space and integrate action features into the model to bolster its generalization capabilities. Using these techniques, we obtain a new GuanDan AI program, DanZero+, which achieves superior performance compared with DanZero.
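The abstract does not spell out how the pretrained model compresses the action space. One plausible reading, sketched below purely as an illustration, is that the pretrained value model scores every legal action, only the top-k candidates are kept, and a policy then assigns probabilities to those candidates using state and action features together. All names here (`top_k_candidates`, `policy_over_candidates`, the toy linear scorer, and the numbers) are hypothetical and are not taken from the paper.

```python
import math


def top_k_candidates(q_values, k):
    """Return indices of the k actions with the highest (assumed) pretrained Q-values."""
    order = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)
    return order[:k]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def policy_over_candidates(state_feat, action_feats, candidates, weights):
    """Score each candidate from concatenated state and action features.

    A toy linear scorer stands in for the policy network (an assumption).
    """
    scores = []
    for idx in candidates:
        x = state_feat + action_feats[idx]  # list concatenation
        scores.append(sum(w * v for w, v in zip(weights, x)))
    return softmax(scores)


# Toy data: five legal actions, each with a 2-dimensional feature vector.
q_values = [0.1, 0.9, -0.3, 0.5, 0.7]  # assumed output of a pretrained value model
action_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
state_feat = [0.3, -0.1]
weights = [0.5, 0.5, 1.0, -1.0]  # hypothetical policy parameters

candidates = top_k_candidates(q_values, k=3)
probs = policy_over_candidates(state_feat, action_feats, candidates, weights)
```

Under this reading, the policy only ever chooses among k candidates regardless of how many legal actions exist, and because each candidate is described by its features rather than by a fixed output index, the same scorer can be applied to actions it has rarely seen.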
