Meta-World+: An Improved, Standardized, RL Benchmark

Published: 18 Sept 2025, Last Modified: 30 Oct 2025
Venue: NeurIPS 2025 Datasets and Benchmarks Track poster
License: CC BY 4.0
Keywords: multi-task reinforcement learning, meta-reinforcement learning
TL;DR: Undocumented versions of Meta-World have clouded algorithmic performance. This work strives to disambiguate Meta-World results from the literature, while also providing insights into benchmark design.
Abstract: Meta-World is widely used for evaluating multi-task and meta-reinforcement learning agents, which are challenged to master diverse skills simultaneously. Since its introduction, however, there have been numerous undocumented changes that inhibit a fair comparison of algorithms. This work strives to disambiguate these results from the literature, while also leveraging the past versions of Meta-World to provide insights into multi-task and meta-reinforcement learning benchmark design. Through this process, we release an open-source version of Meta-World that fully reproduces past results, is more technically ergonomic, and gives users more control over the tasks that are included in a task set.
Primary Area: Data for Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
Submission Number: 446