Interpreting Emergent Military Tactics in a General AlphaZero Framework

Koen Boeckx; Xavier Neyt

19 Sept 2025 (modified: 26 Jan 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0

Keywords: reinforcement learning, alphazero, military

TL;DR: We use AlphaZero to discover military tactics for battlefield scenarios.

Abstract: This paper presents an approach that combines AlphaZero with convolutional and transformer-based neural network architectures to learn strategies in battlefield-inspired gridworld games. These games are designed to balance realism with rapid outcomes, featuring multiple agents organized into competing teams. To encourage effective coordination among agents, we investigate different reward shaping methods and evaluate their impact on emergent teamwork. The learned strategies are analyzed on a tactical level, in an attempt to reveal insights into multi-agent collaboration and competitive behavior. In particular, the framework provides a testbed for studying how military-style strategies can emerge from self-play. Through a series of comparative studies, we further break down the contributions of architectural components and training methodologies to demonstrate the effectiveness of this approach for decision-making in dynamic adversarial settings.

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 18455
