Abstract: Generalizing to new tasks with little supervision is a challenge in machine learning and a requirement for future “General AI” agents. Reinforcement and imitation learning are used to adapt to new tasks, but this is difficult for complex tasks that require long-term planning, many timesteps, or large numbers of subtasks. Such long-horizon tasks produce long episodes that are difficult to learn from. In this work, we attempt to address these issues by training an Imitation Learning agent using in-episode “near future” subgoals. These subgoals are recalculated at each step using compositional arithmetic in a learned latent representation space. In addition to improving learning efficiency for standard long-term tasks, this approach also makes it possible to perform one-shot generalization to previously unseen tasks, given only a single reference trajectory for the task in a different environment. Our experiments show that the proposed approach consistently outperforms the previous state-of-the-art (SOTA) compositional Imitation Learning approach by 30%, while also being capable of learning from long episodes where the SOTA fails.