Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning

Juan Claude Formanek; Asad Jeewa; Jonathan Phillip Shock; Arnu Pretorius

01 Jun 2023 (modified: 12 Dec 2023). Submitted to NeurIPS 2023 Datasets and Benchmarks.

Keywords: reinforcement learning, multi-agent reinforcement learning, offline reinforcement learning

TL;DR: Offline MARL is a nascent field that promises to turn large datasets into powerful, decentralised decision-making systems. However, progress has been hampered by the lack of high-quality benchmark multi-agent datasets.

Abstract: Being able to harness the power of large datasets for developing cooperative multi-agent controllers promises to unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed processes can often be recorded during operation, and large quantities of demonstrative data stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective decentralised controllers from such datasets. However, offline MARL is still in its infancy and therefore lacks standardised benchmark datasets and baselines typically found in more mature subfields of reinforcement learning (RL). These deficiencies make it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing off-the-grid MARL (OG-MARL): a growing repository of high-quality datasets with baselines for cooperative offline MARL research. Our datasets provide settings that are characteristic of real-world systems, including complex environment dynamics, heterogeneous agents, non-stationarity, many agents, partial observability, suboptimality, sparse rewards and demonstrated coordination. For each setting, we provide a range of different dataset types (e.g. Good, Medium, Poor, and Replay) and profile the composition of experiences for each dataset. We hope that OG-MARL will serve the community as a reliable source of datasets and help drive progress, while also providing an accessible entry point for researchers new to the field.

Supplementary Material: pdf

Submission Number: 819
