Exploration Based Language Learning for Text-Based Games

Andrea Madotto; Mahdi Namazifar; Joost Huizinga; Piero Molino; Adrien Ecoffet; Huaixiu Zheng; Alexandros Papangelis; Dian Yu; Chandra Khatri; Gokhan Tur

Exploration Based Language Learning for Text-Based Games

Andrea Madotto, Mahdi Namazifar, Joost Huizinga, Piero Molino, Adrien Ecoffet, Huaixiu Zheng, Alexandros Papangelis, Dian Yu, Chandra Khatri, Gokhan Tur

25 Sept 2019 (modified: 22 Jun 2025)ICLR 2020 Conference Blind SubmissionReaders: Everyone

Keywords: Text-Based Games, Exploration, Language Learning

TL;DR: This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games.

Abstract: This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games. Text-based computer games describe their world to the player through natural language and expect the player to interact with the game using text. These games are of interest as they can be seen as a testbed for language understanding, problem-solving, and language generation by artificial agents. Moreover, they provide a learning environment in which these skills can be acquired through interactions with an environment rather than using fixed corpora. One aspect that makes these games particularly challenging for learning agents is the combinatorially large action space. Existing methods for solving text-based games are limited to games that are either very simple or have an action space restricted to a predetermined set of admissible actions. In this work, we propose to use the exploration approach of Go-Explore (Ecoffet et al., 2019) for solving text-based games. More specifically, in an initial exploration phase, we first extract trajectories with high rewards, after which we train a policy to solve the game by imitating these trajectories. Our experiments show that this approach outperforms existing solutions in solving text-based games, and it is more sample efficient in terms of the number of interactions with the environment. Moreover, we show that the learned policy can generalize better than existing solutions to unseen games without using any restriction on the action space.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/exploration-based-language-learning-for-text/code)

Original Pdf: pdf

9 Replies

Loading