Online Multi-Task Learning for Policy Gradient Methods

Haitham Bou-Ammar, Eric Eaton, Paul Ruvolo, Matthew E. Taylor

2014 (modified: 11 Nov 2022), ICML 2014, Readers: Everyone

Abstract: Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision-making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision-making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
