Keywords: exploration, reinforcement learning, concurrent reinforcement learning
TL;DR: We introduce coordination between parallel agents in concurrent reinforcement learning by drawing correlated samples from the common posterior.
Abstract: Research on exploration in reinforcement learning has mostly focused on problems with a single agent interacting with an environment. However, many problems are better addressed by the concurrent reinforcement learning paradigm, where multiple agents operate in a common environment. Recent work has tackled the challenge of exploration in this particular setting \citep{dimakopoulou2018coordinated,dimakopoulou2018scalable}.
Nonetheless, these approaches do not fully leverage the characteristics of this framework, and the agents end up behaving independently of each other. In this work we argue that coordination among concurrent agents is crucial for efficient exploration.
We introduce coordination in Thompson Sampling based methods by drawing correlated samples from the agents' common posterior.
We apply this idea to extend existing exploration schemes such as randomized least squares value iteration (RLSVI).
Empirical results emphasize the merits of our approach and call attention to coordination as a key objective for efficient exploration in concurrent reinforcement learning.
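To make the core idea concrete, below is a minimal sketch of how concurrent agents could draw correlated samples from a common Gaussian posterior over value-function weights. This is a hypothetical illustration, not the paper's exact construction: the inter-agent correlation parameter `rho` and the function `sample_agent_weights` are assumptions introduced here for exposition, with `rho = 0` recovering independent Thompson Sampling.

```python
import numpy as np

def sample_agent_weights(mu, Sigma, n_agents, rho=-0.2, rng=None):
    """Draw one weight sample per agent from a shared Gaussian posterior
    N(mu, Sigma), with correlation rho between the agents' standard-normal
    draws. rho = 0 gives independent Thompson Sampling; negative rho pushes
    the agents' samples apart, encouraging diversified exploration.

    Hypothetical sketch of 'correlated samples from the common posterior';
    not the method described in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = len(mu)
    L = np.linalg.cholesky(Sigma)          # factor of the posterior covariance

    # Inter-agent correlation matrix for the standard-normal noise.
    # Must be positive definite: requires rho > -1 / (n_agents - 1).
    C = np.full((n_agents, n_agents), rho)
    np.fill_diagonal(C, 1.0)
    Lc = np.linalg.cholesky(C)

    z = rng.standard_normal((n_agents, d))  # independent noise per agent
    z_corr = Lc @ z                         # correlate noise across agents
    return mu + z_corr @ L.T                # one posterior sample per agent

# Example: 4 concurrent agents sharing a 3-dimensional posterior.
mu = np.zeros(3)
Sigma = np.eye(3)
weights = sample_agent_weights(mu, Sigma, n_agents=4)
print(weights.shape)  # (4, 3): each row is one agent's sampled weight vector
```

Each row of the returned array is marginally distributed according to the common posterior, so every agent still performs valid posterior sampling; only the joint distribution across agents changes, which is what introduces coordination.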