Keywords: GAN, information regularization, generalizable policy, task context space
TL;DR: Task context encoding in RL via hypothetical image generation
Abstract: Learning compact state representations from high-dimensional and noisy observations is the cornerstone of reinforcement learning (RL). However, these representations are often biased toward the current task context and overfit to context-irrelevant features, making it hard to generalize to other tasks. Inspired by the human analogy-making process, we propose a novel representation learning framework called Hypothetical Analogy-Making (HAM) for learning robust task contexts and generalizable policies for RL. It consists of task context and background encoding, hypothetical observation generation, and analogy-making between the original and hypothetical observations. Our model introduces an auxiliary objective that maximizes the mutual information between the generated observation and the existing labels of the codes used to generate it. Experiments on various challenging RL environments show that our model helps the RL agent’s learned policy generalize by revealing a robust task context space.
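
As a rough illustration of the mutual-information objective mentioned in the abstract, the sketch below writes out the standard variational lower bound commonly used for this kind of regularization in GAN-style models. The notation is an assumption made for illustration and is not taken from the paper: c is a code label, z stands for the remaining codes, G is the generator, and Q is an auxiliary classifier. The paper's actual objective may differ.

```latex
% Assumed notation (not from the source): c = code label, z = remaining codes,
% \hat{x} = G(z, c) = generated (hypothetical) observation,
% Q = auxiliary classifier approximating the true posterior p(c | \hat{x}).
\begin{align}
I(c; \hat{x})
  &= H(c) - H(c \mid \hat{x}) \\
  &= H(c) + \mathbb{E}_{c \sim p(c),\, z \sim p(z),\, \hat{x} = G(z, c)}
       \big[ \log p(c \mid \hat{x}) \big] \\
  &\ge H(c) + \mathbb{E}_{c \sim p(c),\, z \sim p(z),\, \hat{x} = G(z, c)}
       \big[ \log Q(c \mid \hat{x}) \big]
\end{align}
% The inequality holds because
% \mathrm{KL}\big( p(c \mid \hat{x}) \,\|\, Q(c \mid \hat{x}) \big) \ge 0.
```

Under these assumptions, H(c) is constant when the code distribution is fixed, so maximizing the bound reduces to minimizing the cross-entropy of Q when it predicts the code label from the generated observation.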