Keywords: Reinforcement learning, multi-task learning, representation learning
Abstract: The deep reinforcement learning (RL) framework has shown great promise for tackling sequential decision-making problems, in which an agent learns to behave optimally by interacting with the environment and receiving rewards.
The ability of an RL agent to learn different reward functions concurrently has many benefits, such as the decomposition of task rewards and skill reuse. One obstacle to achieving this is the amount of data required, as well as the model capacity needed to solve multiple tasks. In this paper, we consider the problem of continuous control for various robot manipulation tasks, using an explicit representation that promotes skill reuse while learning multiple tasks related through their reward functions. Our approach relies on two key concepts: successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards, and an actor-critic framework that incorporates the learned SF representations.
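For context, the decoupling the abstract refers to is conventionally written as follows (standard SF notation in the style of Barreto et al., 2017; this is background, not quoted from the paper):

```latex
% Conventional successor-feature decomposition (standard notation, for illustration):
r(s, a, s') = \phi(s, a, s')^{\top} \mathbf{w}, \qquad
\psi^{\pi}(s, a) = \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, \phi(s_t, a_t, s_{t+1}) \;\middle|\; s_0 = s,\ a_0 = a \right], \qquad
Q^{\pi}(s, a) = \psi^{\pi}(s, a)^{\top} \mathbf{w}.
```

Because \(\psi^{\pi}\) depends only on the dynamics and the policy, swapping the task weights \(\mathbf{w}\) re-evaluates the same policy under a new reward, which is what makes skill reuse across related tasks possible.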
We propose a practical implementation of successor features in continuous action spaces. We first show how to learn the decomposable representation required by SF. Our proposed method learns decoupled state and reward feature representations. We study this approach on non-trivial continuous control problems with compositional structure built into the reward functions of the various tasks.
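As a rough sketch of how an SF-based critic could slot into an actor-critic setup for continuous actions, the snippet below implements the Q(s, a) = psi(s, a)^T w factorization in PyTorch. All names (SFCritic, phi_dim, hidden sizes) are hypothetical illustrations, not the authors' implementation:

```python
# Minimal sketch (assumed architecture, not the paper's code): a critic that
# outputs successor features psi(s, a) and combines them with task weights w.
import torch
import torch.nn as nn

class SFCritic(nn.Module):
    """Q(s, a) = psi(s, a)^T w, where psi predicts discounted feature sums."""
    def __init__(self, state_dim: int, action_dim: int, phi_dim: int, hidden: int = 256):
        super().__init__()
        self.psi = nn.Sequential(               # successor features psi(s, a)
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, phi_dim),
        )
        self.w = nn.Parameter(torch.zeros(phi_dim))  # task weights, r = phi^T w

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        psi = self.psi(torch.cat([state, action], dim=-1))  # (batch, phi_dim)
        return psi @ self.w                                  # (batch,) Q-values
```

In such a setup, psi would be trained against a TD target of the form phi(s, a, s') + gamma * psi(s', a'), while w is regressed so that phi(s, a, s')^T w matches the observed reward; the actor is then updated through the resulting Q-values as in a standard actor-critic loop.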
One-sentence Summary: An extension of the successor feature framework to continuous control for multi-task learning
Supplementary Material: zip