Keywords: Reinforcement Learning, Multi-Task Learning, Mode Connectivity, Mixture of Experts
TL;DR: An effective Multi-Task Reinforcement Learning method that leverages Mode Connectivity by learning a subspace of task-specific policies that are linearly connected to the multi-task policy.
Abstract: Acquiring a universal policy that performs multiple tasks is a crucial building block in endowing agents with generalized capabilities. To this end, the field of Multi-Task Reinforcement Learning (MTRL) proposes sharing parameters and representations among tasks during the learning process. Still, optimizing for a single solution that is able to perform various skills remains challenging. Recent works attempt to address these challenges using a mixture of experts, though this comes at the cost of additional inference-time complexity. In this paper, we introduce STAR, a novel MTRL algorithm that leverages mode connectivity to share knowledge across individual skills while remaining parameter-efficient at deployment time. In particular, we show that single-task policies can be linearly connected in policy parameter space to the multi-task policy, i.e., task performance is maintained throughout the linear path connecting the two policies. Our experimental evaluation demonstrates that mode connectivity at training time induces implicit regularization in the multi-task policy, surpassing related baselines on the MuJoCo and Metaworld MTRL benchmarks. Furthermore, STAR achieves competitive performance even against methods that retain multiple models at inference time.
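To make the linear mode-connectivity claim in the abstract concrete, below is a minimal sketch of the kind of check it describes: interpolate between a single-task policy's parameters and the multi-task policy's parameters and evaluate return along the path. The toy environment, linear policy parameterization, and function names (`interpolate`, `evaluate_return`, `toy_env_step`) are illustrative assumptions, not STAR's actual training procedure or evaluation setup.

```python
import numpy as np

def interpolate(theta_task, theta_multi, alpha):
    """Point on the linear path between two flat policy parameter vectors."""
    return (1.0 - alpha) * theta_task + alpha * theta_multi

def evaluate_return(theta, env_step, horizon=200, obs_dim=4, act_dim=2, seed=0):
    """Roll out a toy deterministic linear policy parameterized by a flat
    vector and report the undiscounted return (stand-in for a real RL eval)."""
    rng = np.random.default_rng(seed)
    W = theta.reshape(act_dim, obs_dim)            # flat vector -> weight matrix
    obs, total = rng.normal(size=obs_dim), 0.0
    for _ in range(horizon):
        action = np.tanh(W @ obs)                  # bounded policy output
        obs, reward = env_step(obs, action, rng)   # user-supplied dynamics
        total += reward
    return total

def toy_env_step(obs, action, rng):
    """Placeholder dynamics: reward for keeping the state near the origin."""
    next_obs = 0.9 * obs + 0.1 * np.pad(action, (0, obs.size - action.size))
    next_obs += 0.01 * rng.normal(size=obs.size)
    return next_obs, -float(np.sum(next_obs ** 2))

if __name__ == "__main__":
    dim = 2 * 4                                    # act_dim * obs_dim
    theta_task = np.random.default_rng(1).normal(size=dim)    # single-task solution
    theta_multi = np.random.default_rng(2).normal(size=dim)   # multi-task solution
    # Mode connectivity would mean return stays roughly flat along this path.
    for alpha in np.linspace(0.0, 1.0, 6):
        ret = evaluate_return(interpolate(theta_task, theta_multi, alpha), toy_env_step)
        print(f"alpha={alpha:.1f}  return={ret:8.2f}")
```

With random parameter vectors, as here, return will typically degrade mid-path; the paper's claim is that policies trained with STAR keep performance approximately constant along this interpolation.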
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Ahmed_Hendawy1
Track: Regular Track: unpublished work
Submission Number: 62