What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking

Published: 02 Mar 2026, Last Modified: 04 Mar 2026
ICLR 2026 Workshop AIMS
License: CC BY 4.0
Keywords: Large Language Models, Reinforcement Learning, World Models, What-If Analysis, Proactive Thinking, Game AI, Strategic Reasoning, MOBA, Decision Making
TL;DR: This paper proposes a new thinking paradigm, what-if analysis, that teaches LLMs proactive thinking in a dynamic MOBA game environment.
Abstract: LLMs struggle with decision-making in high-stakes environments such as MOBA games, primarily due to limited proactive reasoning and an incomplete understanding of complex game dynamics. To address this, we propose What-if Analysis LLM (WiA-LLM), a framework that trains an LLM as an explicit, language-based world model. Instead of representing the environment in latent vectors, WiA-LLM uses natural language to simulate how the game state evolves over time in response to candidate actions and provides textual justifications for these predicted outcomes. WiA-LLM is trained in two stages: supervised fine-tuning on human-like reasoning traces, followed by reinforcement learning with outcome-based rewards that align predicted and actual future states. In the Honor of Kings (HoK) environment, WiA-LLM attains 74.2\% accuracy (27\%$\uparrow$ vs. the base model) in forecasting game-state changes. In addition, WiA-LLM demonstrates strategic behavior more closely aligned with expert players than purely reactive LLMs, indicating enhanced foresight and expert-like decision-making.
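The abstract's outcome-based reward, which aligns predicted and actual future states, can be illustrated with a minimal sketch. This is a hypothetical toy example, not the paper's implementation: the function name, the dictionary-based state representation, and the per-field matching score are all assumptions.

```python
# Hypothetical sketch of an outcome-based reward: the model's predicted
# future game state is compared field-by-field to the observed state,
# and the reward is the fraction of fields forecast correctly.
# State keys and scoring scheme are illustrative assumptions.

def outcome_reward(predicted: dict, actual: dict) -> float:
    """Return the fraction of state fields the model forecast correctly."""
    if not actual:
        return 0.0
    correct = sum(1 for key, value in actual.items()
                  if predicted.get(key) == value)
    return correct / len(actual)

# Toy game state: 2 of 3 fields match, so reward = 2/3.
actual = {"gold": 1200, "tower_hp": "low", "enemy_pos": "mid"}
predicted = {"gold": 1200, "tower_hp": "low", "enemy_pos": "top"}
print(round(outcome_reward(predicted, actual), 2))  # → 0.67
```

In an RL loop, this scalar would serve as the return signal for the prediction step; a real system would likely use a softer textual-similarity measure rather than exact field equality.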
Track: Long Paper
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 18