A Collaborative Perspective on Exploration in Reinforcement Learning

Yuwei Fu; Haichao Zhang; Di Wu; Wei Xu; Benoit Boulet

19 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Primary Area: reinforcement learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Exploration, Reinforcement Learning, Intrinsic Rewards

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: Collaborative exploration in reinforcement learning with multiple agents

Abstract: Exploration is one of the central topics in reinforcement learning (RL). Many existing approaches take a single-agent perspective when tackling this problem. In this work, we view the problem from a different angle by taking a multi-agent perspective. Doing so not only lets us learn with parallel agents, which by itself is not fundamentally different, but more importantly unlocks the possibility of collaborative exploration and learning among these agents. We formulate this problem as *Collaborative Exploration* and propose concrete instantiations. We introduce a collaborative reward generator as the core component for inducing collaboration: it computes the novelty of a state not only from one agent's own perspective, but also in a way that respects the other agents' intrinsic motivation in pursuit of novelty. This leads to collaboration among the agents and specialization of each agent within the set. In addition, we discuss how to effectively leverage the information shared by other agents in the data collection and evaluation phases, respectively. Experiments on DeepMind Control Suite (DMC) benchmark tasks demonstrate the effectiveness of the proposed method.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

Supplementary Material: zip

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2065
