Keywords: multi-task reinforcement learning, morphology-agnostic policies
TL;DR: We propose a method for overcoming state and action space disparities across RL domains, improving sample efficiency by up to 70%.
Abstract: Current multi-task reinforcement learning (MTRL) methods can perform a large number of tasks with a single policy. However, when interacting with a new domain, an MTRL agent must be re-trained due to differences in domain dynamics and structure. These limitations force us to train multiple policies even when tasks share dynamics, which requires more samples and is thus sample inefficient. In this work, we explore the ability of MTRL agents to learn across domains with differing dynamics by training in multiple domains simultaneously, without the need to fine-tune additional policies. We find that an MTRL agent trained across multiple domains achieves an increase in sample efficiency of up to 70\% while maintaining its overall success rate.
Submission Number: 6