Keywords: transfer reinforcement learning, different observation spaces, auxiliary tasks, representation learning
TL;DR: We propose a novel algorithm that transfers knowledge across tasks with different observation spaces, without any prior knowledge about an inter-task mapping.
Abstract: In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations, and may thus be subject to dramatic changes over time (e.g., an increased number of observable features). When the observation space changes, however, the previously trained policy usually fails because its input features no longer match, so a new policy has to be trained from scratch, which is both computationally expensive and sample-inefficient. In this paper, we propose a novel algorithm that extracts the latent-space dynamics in the source task and transfers the dynamics model to the target task through a model-based regularizer. Theoretical analysis shows that the transferred dynamics model helps with representation learning in the target task. Our algorithm handles drastic changes of observation space (e.g., from vector-based observations to image-based observations) without any inter-task mapping or any prior knowledge of the target task. Empirical results show that our algorithm significantly improves the efficiency and stability of learning in the target task.
Supplementary Material: zip
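Illustrative sketch (not the authors' implementation): the abstract describes transferring a latent dynamics model from the source task and using it as a model-based regularizer in the target task. One way such a regularizer could look is shown below, under several assumptions that do not come from the abstract: the transferred dynamics are linear and frozen (z' = A z + B a), the target encoder is a linear map, and the regularizer is a squared latent prediction error added to the RL loss with a hypothetical weight lam. All function and variable names are made up for illustration.

```python
# Minimal pure-Python sketch of a latent-dynamics regularizer.
# Assumptions (not from the paper): linear frozen dynamics, linear encoder,
# squared-error penalty, and a scalar weight `lam`.

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def encode(W_enc, obs):
    """Hypothetical target-task encoder: observation -> latent vector."""
    return matvec(W_enc, obs)

def predict_next_latent(A, B, z, a):
    """Frozen dynamics carried over from the source task: z' = A z + B a."""
    return [x + y for x, y in zip(matvec(A, z), matvec(B, a))]

def dynamics_regularizer(W_enc, A, B, obs, action, next_obs):
    """Squared error between the predicted next latent and the encoded next observation."""
    z = encode(W_enc, obs)
    z_next = encode(W_enc, next_obs)
    z_pred = predict_next_latent(A, B, z, action)
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_next))

def total_loss(rl_loss, reg, lam=1.0):
    """Target-task objective: ordinary RL loss plus the weighted regularizer."""
    return rl_loss + lam * reg

# Toy transition: 3-dim observation, 2-dim latent, 1-dim action.
A = [[1, 0], [0, 1]]            # frozen latent transition matrix
B = [[1], [0]]                  # frozen action effect
obs, action, next_obs = [0, 0, 5], [1], [1, 0, 7]

W_good = [[1, 0, 0], [0, 1, 0]]  # encoder whose latents agree with the dynamics
W_bad = [[0, 0, 1], [0, 1, 0]]   # encoder that latches onto an unrelated feature

reg_good = dynamics_regularizer(W_good, A, B, obs, action, next_obs)
reg_bad = dynamics_regularizer(W_bad, A, B, obs, action, next_obs)
print(reg_good, reg_bad, total_loss(2.0, reg_bad, lam=0.5))
```

In this toy setting the encoder whose latents evolve according to the transferred dynamics incurs no penalty, while the other encoder is penalized, which is the sense in which a fixed dynamics model can shape the representation learned in the target task. In practice the encoder would be a neural network trained by gradient descent, and the method may also use other components (for example, a reward model) that this sketch leaves out.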