Dual-Scale World Models for LLM Agents towards Hard-Exploration Problems

ICLR 2026 Conference Submission15093 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: hard-exploration problems, world model, llm agents, text-based games

TL;DR: We introduce GLoW, a framework for LLM agents leveraging dual-scale world models for hard-exploration problems.

Abstract: LLM-based agents have seen promising advances, yet they remain limited in “hard-exploration” tasks, which require acquiring new knowledge through exploration. We present GLoW, a novel approach that leverages dual-scale world models. At the global scale, it maintains a trajectory frontier of high-value discoveries; at the local scale, it learns from trial-and-error exploration through a Multi-path Advantage Reflection mechanism, which infers advantage-based progress signals to guide exploration. To evaluate our framework on hard-exploration problems, we use the Jericho benchmark suite of text-based games, where GLoW achieves new state-of-the-art performance among LLM-based approaches. Compared to state-of-the-art RL-based methods, our approach achieves comparable performance while requiring 100-800× fewer environment interactions.

Primary Area: applications to robotics, autonomy, planning

Submission Number: 15093
