Out-of-Distribution Study of Rule-Based and Strategic Reasoning in Chess Transformers
Track: long paper (up to 10 pages)
Keywords: Language model, reasoning, transformer, chess, out-of-distribution generalization, compositional generalization, rule extrapolation, chess960
Abstract: Modern decision transformers, trained similarly to LLMs, can achieve strong in-distribution performance in complex sequential domains like chess, but it remains unclear to what extent they reason systematically about rules and strategy. We study the reasoning capabilities of a 270M-parameter chess transformer trained via behavior cloning on standard chess. To investigate its abilities, we construct out-of-distribution test sets, including board states and variants never seen during training, designed to reveal failures of systematic generalization. Our analysis shows that the model exhibits robust rule-based reasoning, consistently generating legal moves in novel configurations, but its strategic reasoning is more limited. The model generates high-quality moves on curated OOD puzzles and shows basic strategy adaptation in full games. It underperforms symbolic AI algorithms that rely on explicit search, although the performance gap is smaller when playing against human users on Lichess. Moreover, the training dynamics reveal distinct phases in how the model learns to respect the game's fundamental constraints, suggesting an emergent compositional understanding of the game.
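The legality-rate evaluation the abstract alludes to could be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual harness: `model_policy` and `is_legal` are stand-in callables (a real evaluation would query the trained transformer and a chess library such as python-chess for legality).

```python
def legality_rate(positions, model_policy, is_legal):
    """Fraction of model outputs that are legal in their positions.

    positions: list of FEN strings (OOD board states);
    model_policy: FEN -> move string (the model under test);
    is_legal: (FEN, move string) -> bool (rule oracle).
    All three are hypothetical stand-ins for illustration.
    """
    if not positions:
        return 0.0
    ok = sum(1 for fen in positions if is_legal(fen, model_policy(fen)))
    return ok / len(positions)

# Toy usage with dummy stand-ins: pretend one of three outputs is illegal.
positions = ["fen_a", "fen_b", "fen_c"]
policy = lambda fen: "e2e4"
is_legal = lambda fen, mv: fen != "fen_c"
print(legality_rate(positions, policy, is_legal))  # → 0.666...
```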
Presenter: ~Patrik_Reizinger1
Format: Yes, the presenting author will definitely attend in person, as they are attending ICLR for other reasons as well.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 48