Enhancing Language Agent Strategic Reasoning through Self-Play in Adversarial Games

ACL ARR 2026 January Submission 9602 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: language agent, self play, strategic adversarial games
Abstract: Language agents often struggle with strategic reasoning in adversarial games. A promising approach is to learn automatically from game interactions, but unlike in static environments, the choice of opponent in adversarial settings significantly affects learning, a factor that remains underexplored. We propose **S**tep-level poli**C**y **O**ptimization through **P**lay-**A**nd-**L**earn (SCO-PAL) and conduct a systematic analysis of opponent selection, finding that self-play is the most effective for improving strategic reasoning. With SCO-PAL and self-play, we improve the average win rate from 32.17% (base model) to 50.08%, achieving 54.76% against GPT-4 across six games. The learned skills also generalize to unseen games and broader reasoning tasks, demonstrating the unique advantages of LLM-based agents.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: language agent, self play, strategic adversarial games
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 9602