Keywords: Reinforcement Learning, Transfer Learning, Minimum Description Length
TL;DR: We propose an approach for discovering reusable skills from a large, offline dataset by minimizing description length, and show that these skills accelerate RL on downstream tasks.
Abstract: Humans can quickly learn new tasks by reusing a large number of previously acquired skills. How can we discover such reusable skills for artificial agents when given a large dataset of prior experience? Past works leverage extensive human supervision to define skills or use simple skill heuristics that limit their expressiveness. In contrast, we propose a principled, unsupervised objective for skill discovery from large, offline datasets based on the Minimum Description Length principle: we show that a "codebook" of skills that maximally compresses the training data can be reused to efficiently learn new tasks. By minimizing description length, we strike an optimal balance between the number of extracted skills and their complexity. We show that our approach outperforms alternative approaches that heuristically define skills on a complex, long-horizon maze navigation task.
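To make the MDL intuition in the abstract concrete, the following is a minimal toy sketch (not the paper's actual method or objective) of a two-part description-length score: bits to describe the skill codebook itself, plus bits to encode trajectories of discrete actions as sequences of skill (or primitive-action) symbols. All names, the uniform coding scheme, the greedy longest-match segmentation, and the action-space size are illustrative assumptions.

```python
import math


def description_length(dataset, codebook, num_actions=4):
    """Toy two-part MDL score: L(model) + L(data | model).

    dataset: list of trajectories, each a tuple of discrete actions.
    codebook: list of skills, each a tuple of primitive actions.
    num_actions: assumed size of the discrete action space (illustrative).
    """
    # Model cost: charge log2(num_actions) bits per primitive action
    # step in each skill (a simple uniform code over actions).
    model_bits = sum(len(skill) * math.log2(num_actions) for skill in codebook)

    # Data cost: greedily segment each trajectory with longest-match
    # skills, falling back to single primitive actions, and charge a
    # uniform log2(|codebook| + num_actions) bits per emitted symbol.
    symbol_bits = math.log2(len(codebook) + num_actions)
    data_bits = 0.0
    for traj in dataset:
        i = 0
        while i < len(traj):
            match = max(
                (s for s in codebook if traj[i:i + len(s)] == s),
                key=len,
                default=None,
            )
            i += len(match) if match else 1
            data_bits += symbol_bits
    return model_bits + data_bits
```

Under this toy score, a codebook that captures a repeated action pattern compresses the data better than having no skills at all, despite paying for its own description; a codebook of many long, rarely reused skills would instead inflate the model cost, which is the balance between skill count and skill complexity that the abstract refers to.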