- TL;DR: Weakly supervised segmentation of human demonstrations into skill primitives by only using trajectory-level labels at training with neither time-step segmentation labels nor ordering information.
- Abstract: Learning useful and reusable skill, or sub-task primitives, is a long-standing problem in sensorimotor control. This is challenging because it's hard to define what constitutes a useful skill. Instead of direct manual supervision which is tedious and prone to bias, in this work, our goal is to extract reusable skills from a collection of human demonstrations collected directly for several end-tasks. We propose a weakly-supervised approach for trajectory segmentation following the classic work on multiple instance learning. Our approach is end-to-end trainable, works directly from high-dimensional input (e.g., images) and only requires the knowledge of what skill primitives are present at training, without any need of segmentation or ordering of primitives. We evaluate our approach via rigorous experimentation across four environments ranging from simulation to real world robots, procedurally generated to human collected demonstrations and discrete to continuous action space. Finally, we leverage the generated skill segmentation to demonstrate preliminary evidence of zero-shot transfer to new combinations of skills. Result videos at https://sites.google.com/view/trajectory-segmentation/
- Keywords: skills, demonstration, agent, sub-task, primitives, robot learning, manipulation