Efficient Multi-task Reinforcement Learning via Selective Behavior Sharing

24 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Multi-task Reinforcement Learning, Behavior sharing
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Selectively sharing behaviors between tasks improves sample efficiency in multi-task reinforcement learning.
Abstract: Multi-task Reinforcement Learning (MTRL) offers several avenues to address the issue of sample efficiency through information sharing between tasks. However, prior MTRL methods primarily exploit data and parameter sharing, overlooking the potential of sharing learned behaviors across tasks. The few existing behavior-sharing approaches falter because they directly imitate the policies from other tasks, leading to suboptimality when different tasks require different actions for the same states. To preserve optimality, we introduce a novel, generally applicable behavior-sharing formulation that selectively leverages other task policies as the current task's behavioral policy for data collection to efficiently learn multiple tasks simultaneously. Our proposed MTRL framework estimates the shareability between task policies and incorporates them as temporally extended behaviors to collect training data. Empirically, selective behavior sharing improves sample efficiency on a wide range of manipulation, locomotion, and navigation MTRL task families and is complementary to parameter sharing. Result videos are available at [https://sites.google.com/view/qmp-mtrl](https://sites.google.com/view/qmp-mtrl).
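The abstract does not say how shareability between task policies is estimated or how long a shared behavior is followed. The sketch below is only an illustration of the general idea, under stated assumptions: shareability is scored with the current task's own Q-function estimate, and the selected policy is then followed for a fixed number of steps. The function names, the Q-based score, the `step` environment callback, and the `horizon` parameter are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = float
Policy = Callable[[State], Action]
QFunction = Callable[[State, Action], float]
StepFn = Callable[[State, Action], Tuple[State, float]]
Transition = Tuple[State, Action, float, State]


def select_behavior_policy(
    state: State, policies: List[Policy], q_current: QFunction
) -> int:
    """Return the index of the policy whose proposed action scores highest
    under the current task's Q estimate (assumed shareability score)."""
    proposals = [policy(state) for policy in policies]
    scores = [q_current(state, action) for action in proposals]
    return max(range(len(policies)), key=scores.__getitem__)


def collect_segment(
    state: State,
    task_id: int,
    policies: List[Policy],
    q_functions: List[QFunction],
    step: StepFn,
    horizon: int = 5,
) -> Tuple[int, List[Transition]]:
    """Select a behavior policy once, then follow it for `horizon` steps,
    so the shared behavior is temporally extended (assumed fixed length)."""
    chosen = select_behavior_policy(state, policies, q_functions[task_id])
    transitions: List[Transition] = []
    for _ in range(horizon):
        action = policies[chosen](state)
        next_state, reward = step(state, action)
        transitions.append((state, action, reward, next_state))
        state = next_state
    return chosen, transitions


# Toy usage: a 1-D world where task 0 wants to reach position 1.0.
policies = [lambda s: -0.1, lambda s: 0.5]          # task 0 and task 1 policies
q_functions = [
    lambda s, a: -abs(s[0] + a - 1.0),              # task 0: closer to 1.0 is better
    lambda s, a: -abs(s[0] + a + 1.0),              # task 1: closer to -1.0 is better
]
step = lambda s, a: ([s[0] + a], -abs(s[0] + a - 1.0))  # task-0 dynamics and reward

chosen, segment = collect_segment([0.0], 0, policies, q_functions, step, horizon=3)
```

In this toy setup, the data for task 0 is gathered by whichever policy the task-0 Q estimate prefers at the start of the segment. The collected transitions would then go into task 0's replay buffer and be used to train task 0's own policy off-policy, which is why selection only affects data collection and not the learning objective.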
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8641