D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Offline RL, Imitation Learning, Representation Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: New set of tasks and datasets for realistic offline RL algorithm development and evaluation
Abstract: Offline reinforcement learning algorithms hold the promise of enabling data-driven RL methods that do not require costly or dangerous real-world exploration and benefit from large pre-collected datasets. This in turn can facilitate real-world applications, as well as a more standardized approach to RL research. Furthermore, offline RL methods can provide effective initializations for online finetuning, overcoming challenges with exploration. However, evaluating progress on offline RL algorithms requires effective and challenging benchmarks that capture properties of real-world tasks, provide a range of task difficulties, and cover a range of challenges both in terms of the parameters of the domain (e.g., length of the horizon, sparsity of rewards) and the parameters of the data (e.g., narrow demonstration data or broad exploratory data). While considerable progress in offline RL in recent years has been enabled by simpler benchmark tasks, performance on the most widely used datasets is increasingly saturating, and these datasets might fail to reflect properties of realistic tasks. We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments, is based on models of real-world robotic systems, and comprises a variety of data sources, including scripted data, over 20 hours of demonstrations and play-style data collected by human teleoperators, and other sources. Our proposed benchmark covers state-based and image-based domains, and aims to test a number of real-world robot training challenges such as long-horizon manipulation, fine-grained motor control, imperfect controllers, and representation learning. Our proposed tasks vary in complexity from single-instance settings to diverse scenarios with multiple distribution shifts, which can require significant robustness and generalization.
Moreover, we support both offline RL evaluation and evaluation with online finetuning, with some of the tasks specifically designed to require both pretraining and finetuning. We hope that our proposed benchmark will facilitate further progress on both offline RL algorithms and algorithms designed for online finetuning from offline initialization.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7967