Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents

Jane X Wang; Michael King; Nicolas Pierre Mickael Porcel; Zeb Kurth-Nelson; Tina Zhu; Charlie Deck; Peter Choy; Mary Cassin; Malcolm Reynolds; H. Francis Song; Gavin Buttimore; David P Reichert; Neil Charles Rabinowitz; Loic Matthey; Demis Hassabis; Alexander Lerchner; Matthew Botvinick

Published: 11 Oct 2021, Last Modified: 23 May 2023

Venue: NeurIPS 2021 Datasets and Benchmarks Track (Round 2)

Readers: Everyone

Keywords: meta-reinforcement learning, benchmark, deep RL, agent analysis

TL;DR: We introduce a new meta-reinforcement learning benchmark that tests and analyzes deep RL agents' abilities to perform structured latent inference.

Abstract: There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled analysis. In the present work, we introduce a new benchmark for meta-RL research, emphasizing transparency and potential for in-depth analysis as well as structural richness. Alchemy is a 3D video game, implemented in Unity, which involves a latent causal structure that is resampled procedurally from episode to episode, affording structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents. Results clearly indicate a frank and specific failure of meta-learning, providing validation for Alchemy as a challenging benchmark for meta-RL. Concurrent with this report, we are releasing Alchemy as a public resource, together with a suite of analysis tools and sample agent trajectories.

URL: https://github.com/deepmind/dm_alchemy

Supplementary Material: pdf

Contribution Process Agreement: Yes

License: The Alchemy environment and analysis tools can be found at https://github.com/deepmind/dm_alchemy and are released under the Apache License 2.0. Further licensing details and all documentation can be found at this repository.

Author Statement: Yes

