Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning

David Yunis; Justin Jung; Falcon Z Dai; Matthew Walter

Published: 25 Sept 2024, Last Modified: 06 Nov 2024, Venue: NeurIPS 2024 poster, License: CC BY 4.0

Keywords: Reinforcement Learning, Deep Learning, Exploration, Hierarchical RL

TL;DR: Tokenization methods from NLP lead to much faster skill extraction, which can help solve very difficult sparse-reward RL tasks.

Abstract: Exploration in sparse-reward reinforcement learning (RL) is difficult due to the need for long, coordinated sequences of actions in order to achieve any reward. Skill learning, from demonstrations or interaction, is a promising approach to address this, but skill extraction and inference are expensive for current methods. We present a novel method to extract skills from demonstrations for use in sparse-reward RL, inspired by the popular Byte-Pair Encoding (BPE) algorithm in natural language processing. With these skills, we show strong performance in a variety of tasks, 1000$\times$ acceleration for skill-extraction and 100$\times$ acceleration for policy inference. Given the simplicity of our method, skills extracted from 1\% of the demonstrations in one task can be transferred to a new loosely related task. We also note that such a method yields a finite set of interpretable behaviors. Our code is available at https://github.com/dyunis/subwords_as_skills.
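
The abstract names Byte-Pair Encoding as the inspiration but does not spell out the extraction procedure. As a rough illustration of the underlying idea only, the sketch below runs plain BPE merges over demonstration action sequences, assuming the actions have already been discretized to integer ids. The function names (`merge_pair`, `bpe_skills`) and parameters are hypothetical, and this is not the authors' implementation; their repository is the reference for the actual method.

```python
from collections import Counter


def merge_pair(seq, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `seq` with `new_id`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def bpe_skills(demos, num_primitives, num_merges):
    """Run BPE merges over discrete action sequences (illustrative sketch).

    Assumption: every action in `demos` is an int in range(num_primitives).
    Returns a dict mapping token id -> tuple of primitive actions, so each
    merged token can be read as a candidate multi-step "skill".
    """
    # Start with one token per primitive action.
    vocab = {a: (a,) for a in range(num_primitives)}
    seqs = [list(d) for d in demos]
    for _ in range(num_merges):
        # Count adjacent token pairs across all demonstrations.
        counts = Counter()
        for s in seqs:
            counts.update(zip(s, s[1:]))
        if not counts:
            break  # nothing left to merge
        pair, _count = counts.most_common(1)[0]
        # The new token expands to the concatenation of its two parts.
        new_id = len(vocab)
        vocab[new_id] = vocab[pair[0]] + vocab[pair[1]]
        seqs = [merge_pair(s, pair, new_id) for s in seqs]
    return vocab


if __name__ == "__main__":
    toy_demos = [[0, 1, 0, 1, 2], [0, 1, 2, 2]]
    for token, actions in bpe_skills(toy_demos, num_primitives=3, num_merges=2).items():
        print(token, actions)
```

On the toy input, the first merge joins the most frequent adjacent pair `(0, 1)` into a new token, and the second merge extends it to the three-step sequence `(0, 1, 2)`. Because the result is a finite table of action sequences, each entry can be inspected directly, which is the sense in which the abstract describes the behaviors as interpretable.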

Primary Area: Reinforcement learning

Submission Number: 20181
