TALES: Text Adventure Learning Environment Suite

ICLR 2026 Conference Submission 20662 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Benchmark, text-adventure games, LLM Agents
TL;DR: A unified text-adventure game benchmark with qualitative analysis of top-performing models' game logs
Abstract: Reasoning is an essential skill that enables Large Language Models (LLMs) to interact with the world. As tasks become more complex, they demand increasingly sophisticated and diverse reasoning capabilities for sequential decision-making, requiring structured reasoning over the context history to determine the next best action. We introduce TALES, a diverse collection of synthetic and human-written text-adventure games designed to challenge and evaluate a wide range of reasoning capabilities. We present results for a range of LLMs, both open- and closed-weight, and perform a qualitative analysis of the top-performing models. Despite an impressive showing on synthetic games, even the best LLM-driven agents fail to achieve a score of 20% on games designed for human enjoyment. Visualizations of the experiments can be found at https://github.com/tale-suite/tale-suite-anonymized.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 20662