Mitigating Conflicts in Multi-Task Reinforcement Learning via Progressively-Trained Dynamic Policy Network
Keywords: reinforcement learning, continual learning, multi-task learning
Abstract: Reinforcement learning is widely applied in fields such as game playing, robotic control, and autonomous driving. However, we find that when a standard reinforcement learning algorithm is trained on multiple tasks with inter-task conflicts, it may yield limited performance on the individual tasks. To mitigate this, we first introduce a dynamic policy network that incorporates diverse computational pathways of varying depths, along with gating modules that selectively activate the appropriate pathways for different tasks. This more flexible design allows the network to achieve improved multi-task performance. Second, we propose a progressive training technique that mitigates inter-task conflicts by leveraging a proper training order and continual learning techniques. Combining the dynamic policy network with progressive training, we successfully train a single policy capable of performing seven quadrupedal locomotion tasks and another policy that achieves an improved final average reward on ten MiniHack games.
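To make the described architecture concrete, below is a minimal sketch, not the authors' implementation, of a policy network with pathways of varying depths and a task-conditioned gating module. All module names, layer sizes, and the use of soft (softmax-weighted) gating are illustrative assumptions; the paper's actual design may differ.

```python
# Illustrative sketch only: a gated multi-pathway policy network.
# Sizes, depths, and the soft-gating scheme are assumptions, not the paper's spec.
import torch
import torch.nn as nn


def mlp(in_dim: int, hidden: int, depth: int, out_dim: int) -> nn.Sequential:
    """Build one MLP pathway of the given depth."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)


class DynamicPolicy(nn.Module):
    """Several pathways of different depths; a gate conditioned on the
    observation and a task embedding softly activates the pathways."""

    def __init__(self, obs_dim: int, act_dim: int, num_tasks: int,
                 depths=(1, 2, 4), hidden: int = 256, task_emb: int = 16):
        super().__init__()
        self.pathways = nn.ModuleList(
            [mlp(obs_dim, hidden, d, act_dim) for d in depths]
        )
        self.task_embedding = nn.Embedding(num_tasks, task_emb)
        # Gate maps observation + task embedding to per-pathway weights.
        self.gate = nn.Sequential(
            nn.Linear(obs_dim + task_emb, hidden), nn.ReLU(),
            nn.Linear(hidden, len(depths)),
        )

    def forward(self, obs: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        emb = self.task_embedding(task_id)                       # (B, task_emb)
        weights = torch.softmax(self.gate(torch.cat([obs, emb], -1)), -1)
        outs = torch.stack([p(obs) for p in self.pathways], 1)   # (B, P, act_dim)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)         # (B, act_dim)


# Usage: action outputs for a small batch drawn from two different tasks.
policy = DynamicPolicy(obs_dim=48, act_dim=12, num_tasks=7)
obs = torch.randn(2, 48)
task_id = torch.tensor([0, 3])
actions = policy(obs, task_id)
```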
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 15080