Out-of-Distribution Study of Rule-Based and Strategic Reasoning in Chess Transformers

Published: 05 Mar 2026, Last Modified: 05 Mar 2026
Venue: ICLR 2026 Workshop LLM Reasoning
License: CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: Language model, reasoning, transformer, chess, out-of-distribution generalization, compositional generalization, rule extrapolation, chess960
Abstract: Modern decision transformers, trained similarly to LLMs, can achieve strong in-distribution performance in complex sequential domains like chess, but it remains unclear to what extent they reason systematically about rules and strategy. We study the reasoning capabilities of a 270M-parameter chess transformer trained via behavior cloning on standard chess. To investigate its abilities, we construct out-of-distribution test sets — including board states and variants never seen during training — designed to reveal failures of systematic generalization. Our analysis shows that the model exhibits robust rule-based reasoning, consistently generating legal moves in novel configurations, but its strategic reasoning is more limited. The model generates high-quality moves on curated OOD puzzles and shows basic strategy adaptation in full games. It underperforms symbolic AI algorithms that rely on explicit search, although the performance gap is smaller when playing against human users on Lichess. Moreover, the training dynamics reveal distinct phases in how the model learns to respect the game's fundamental constraints, suggesting an emergent compositional understanding of the game.
Presenter: ~Patrik_Reizinger1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other, complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR's funding aims, or they have sufficient alternate funding.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 48