Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs

Pratham Singla; Shivank Garg; VIHAN SINGH

Published: 07 Jun 2026, Last Modified: 10 Jun 2026

Venue: ICML 2026 Workshop

License: CC BY 4.0

Keywords: No-Limit Texas Hold'em, Game-Playing Agents, Strategic Reasoning, Cognitive Profiling, Multi-Axis Evaluation

Abstract: Strategic reasoning under uncertainty underpins consequential decisions in negotiation, finance, and policy, but prevailing game-play benchmarks collapse heterogeneous reasoning dimensions into a single scalar, leaving the capability structure of frontier LLMs unexamined. We introduce *Poker Arena*, a no-limit Texas Hold'em tournament platform that couples a three-layer memory architecture (within-hand, session, and cross-session) with a nine-axis cognitive profile that decomposes strategic reasoning into interpretable dimensions such as bet-sizing calibration and positional awareness. We evaluate seven frontier models across 50 sessions of 1,000 hands and a controlled memory ablation. Tournament chips and aggregate axis score order the field differently: Claude Opus 4.6 wins +\$15,730 in chips with 14 first-place finishes, yet ranks only fifth of seven on mean axis score. Persistent memory, meanwhile, helps some models and hurts others. These findings show that multi-axis evaluation surfaces capability structure that scalar leaderboards systematically misrank, with cross-dimensional consistency outweighing peak performance on any single axis.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Paper Type: Standard paper

Submission Number: 62
