Zero-Shot Visual Imitation
Deepak Pathak*, Parsa Mahmoudieh*, Michael Luo*, Pulkit Agrawal*, Dian Chen, Fred Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
Feb 15, 2018 (modified: Feb 15, 2018) | ICLR 2018 Conference Blind Submission | Readers: everyone
Abstract: Existing approaches to imitation learning distill both what to do (goals) and how to do it (skills) from expert demonstrations. This expertise is effective but expensive supervision: it is not always practical to collect many detailed demonstrations. We argue that if an agent has access to its environment along with the expert, it can learn skills from its own experience and rely on expertise for the goals alone. We weaken the expert supervision required to a single visual demonstration of the task, that is, observation of the expert without knowledge of the actions. Our method is "zero-shot" in that we never see expert actions and never see demonstrations during learning. Through self-supervised exploration, our agent learns to act and to recognize its actions so that it can infer expert actions once given a demonstration in deployment. During training, the agent learns a skill policy for reaching a target observation from the current observation. During inference, the expert demonstration communicates the goals to imitate while the skill policy determines how to imitate. Our novel skill policy architecture and dynamics consistency loss extend visual imitation to more complex environments while improving robustness. Our zero-shot imitator, having no prior knowledge of the environment and making no use of the expert during training, learns from experience to follow experts for navigating an office with a TurtleBot and manipulating rope with a Baxter robot. Videos and detailed result analysis are available at https://sites.google.com/view/zero-shot-visual-imitation/home
TL;DR: Agents can learn to imitate from visual demonstrations alone (without actions) at test time, after learning from their own experience without any form of supervision at training time.
Keywords: imitation, zero shot, self-supervised, robotics, skills, navigation, manipulation
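The two-phase pipeline described in the abstract (explore without an expert, then follow an observation-only demonstration) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the 1-D corridor environment, the function names (`explore`, `learn_skill_policy`, `imitate`), and the tabular lookup that stands in for the learned skill policy. The paper's neural architecture and dynamics consistency loss are not modeled here.

```python
import random

# Toy 1-D corridor standing in for the real environment (illustrative assumption).
N_STATES = 10
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Deterministic toy dynamics: move one cell and clip to the corridor."""
    return max(0, min(N_STATES - 1, state + action))


def explore(num_steps, seed=0):
    """Self-supervised exploration: random actions, no expert, no reward."""
    rng = random.Random(seed)
    state = rng.randrange(N_STATES)
    transitions = []
    for _ in range(num_steps):
        action = rng.choice(ACTIONS)
        next_state = step(state, action)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions


def learn_skill_policy(transitions):
    """Build a one-step goal-conditioned table: (observation, goal) -> action.

    Hypothetical stand-in for a learned skill policy. It records which action
    most often led from `state` to `next_state` during exploration.
    """
    counts = {}
    for state, action, next_state in transitions:
        if next_state == state:
            continue  # wall bump: the action had no visible effect
        votes = counts.setdefault((state, next_state), {})
        votes[action] = votes.get(action, 0) + 1
    return {key: max(votes, key=votes.get) for key, votes in counts.items()}


def imitate(policy, start, demo_observations, max_steps_per_goal=5):
    """Follow a demonstration given as observations only (no expert actions)."""
    state = start
    visited = [state]
    for goal in demo_observations:
        for _ in range(max_steps_per_goal):
            if state == goal:
                break
            action = policy.get((state, goal))
            if action is None:
                break  # goal is not reachable in one step per the learned table
            state = step(state, action)
            visited.append(state)
    return visited


if __name__ == "__main__":
    skill_policy = learn_skill_policy(explore(num_steps=5000))
    # The demonstration lists only the states the expert passed through.
    print(imitate(skill_policy, start=2, demo_observations=[3, 4, 5, 6]))
```

In this sketch the expert's observations act purely as a sequence of goals, while the action choices come entirely from what the agent recorded during its own exploration, which mirrors the goals-versus-skills split the abstract describes.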