Keywords: chess language modeling, mixture of experts, reinforcement learning, behavioral stylometry
TL;DR: Sparse Chess Language Models with Player Routing
Abstract: Modern chess language models are dense transformers trained on millions of games played by thousands of high-rated players. However, these monolithic networks tend to collapse into mode-averaged behavior, where stylistic boundaries blur and rare but effective strategies are suppressed. To counteract this homogenization, we introduce Mixture-of-Masters (MoM), the first chess mixture-of-experts model, whose small GPT experts each emulate a world-class grandmaster. Each expert is trained with a combination of self-supervised learning and reinforcement learning guided by chess-specific rewards. At each move, a learnable gating network, trained post hoc, selects the most appropriate persona for the current game state, allowing MoM to switch its style dynamically, e.g., Tal's attacking verve, Capablanca's positional mastery, or Petrosian's defensive solidity. To quantitatively assess whether each expert captures a distinctive playing signature, we propose a behavioral stylometry metric, training a vision transformer encoder to classify grandmasters from game segments. When evaluated against Stockfish on unseen standard games, MoM outperforms both dense individual expert networks and popular GPT baselines trained on aggregated data.
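To make the per-move routing concrete, the sketch below (not the authors' implementation; all class names, dimensions, and the hard top-1 routing rule are illustrative assumptions) shows how a learnable gating network might score a set of per-grandmaster GPT experts from an encoding of the game state and let the top-scoring persona produce the next-move distribution.

```python
# Minimal sketch of per-move expert routing, assuming a state embedding is
# available for gating and that each expert is a small next-move language model.
import torch
import torch.nn as nn


class GatingNetwork(nn.Module):
    """Scores each expert from an encoding of the current game state."""

    def __init__(self, state_dim: int, num_experts: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_experts),
        )

    def forward(self, state_embedding: torch.Tensor) -> torch.Tensor:
        # Probability over personas (e.g. Tal, Capablanca, Petrosian).
        return torch.softmax(self.scorer(state_embedding), dim=-1)


class MixtureOfMasters(nn.Module):
    """Routes each move to the most appropriate grandmaster expert."""

    def __init__(self, experts: nn.ModuleList, state_dim: int):
        super().__init__()
        self.experts = experts                      # one small GPT per grandmaster
        self.gate = GatingNetwork(state_dim, len(experts))

    def forward(self, move_tokens: torch.Tensor,
                state_embedding: torch.Tensor) -> torch.Tensor:
        weights = self.gate(state_embedding)        # (batch, num_experts)
        expert_id = int(weights.argmax(dim=-1)[0])  # hard top-1 routing per move
        return self.experts[expert_id](move_tokens) # next-move logits from that persona


if __name__ == "__main__":
    vocab_size, state_dim = 4096, 64
    # Stand-in "experts": tiny next-move heads in place of full GPT models.
    experts = nn.ModuleList([
        nn.Sequential(nn.Embedding(vocab_size, 32),
                      nn.Flatten(start_dim=1),
                      nn.LazyLinear(vocab_size))
        for _ in range(3)
    ])
    model = MixtureOfMasters(experts, state_dim)
    logits = model(torch.randint(0, vocab_size, (1, 8)), torch.randn(1, state_dim))
    print(logits.shape)  # torch.Size([1, 4096])
```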
Primary Area: foundation or frontier models, including LLMs
Submission Number: 19919