Third Person Imitation Learning
Bradly C Stadie, Pieter Abbeel, Ilya Sutskever
Nov 04, 2016 (modified: Mar 06, 2017). ICLR 2017 conference submission. Readers: everyone
Abstract: Reinforcement learning (RL) makes it possible to train agents capable of achieving
sophisticated goals in complex and uncertain environments. A key difficulty in
reinforcement learning is specifying a reward function for the agent to optimize.
Traditionally, imitation learning in RL has been used to overcome this problem.
Unfortunately, existing imitation learning methods tend to require that demonstrations
be supplied in the first person: the agent is provided with a sequence of
states and a specification of the actions that it should have taken. While powerful,
this kind of imitation learning is limited by the relatively hard problem of collecting
first-person demonstrations. Humans address this problem by learning from
third-person demonstrations: they observe other humans perform tasks, infer the
task, and accomplish the same task themselves.
In this paper, we present a method for unsupervised third-person imitation learning.
Here third-person refers to training an agent to correctly achieve a simple
goal in a simple environment when it is provided a demonstration of a teacher
achieving the same goal but from a different viewpoint; and unsupervised refers
to the fact that the agent receives only these third-person demonstrations, and is
not provided a correspondence between teacher states and student states. Our
method's primary insight is that recent advances in domain confusion can be
utilized to yield domain-agnostic features, which are crucial during the training
process. To validate our approach, we report successful experiments on learning
from third-person demonstrations in a pointmass domain, a reacher domain, and
TL;DR: An agent watches another agent complete a task from a different camera angle and learns from raw pixels how to imitate it.
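
The abstract does not spell out how domain confusion is implemented, so the following is only an illustrative sketch of one common formulation, not the authors' architecture. It assumes a hypothetical linear domain classifier (parameters `w`, `b`) that tries to tell teacher-view features from student-view features, and a confusion loss that is smallest when that classifier cannot do better than a 50/50 guess. All function names and the toy data are assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def domain_classifier_loss(features, domain_labels, w, b):
    """Binary cross-entropy of a linear domain classifier.

    Assumed labeling for this sketch: 0 = teacher (third-person) view,
    1 = student (first-person) view. The classifier wants this loss low.
    """
    p = sigmoid(features @ w + b)
    eps = 1e-12  # avoids log(0)
    return -np.mean(
        domain_labels * np.log(p + eps)
        + (1 - domain_labels) * np.log(1 - p + eps)
    )


def domain_confusion_loss(features, w, b):
    """Cross-entropy between the classifier output and a uniform target.

    This is minimized (value log 2) when the classifier predicts 0.5 for
    every sample, i.e. when the features carry no domain information.
    A feature extractor would be trained to push this loss down.
    """
    p = sigmoid(features @ w + b)
    eps = 1e-12
    return -np.mean(0.5 * np.log(p + eps) + 0.5 * np.log(1 - p + eps))


# Toy 1-D features in which the two viewpoints are trivially separable.
features = np.array([[2.0], [2.0], [-2.0], [-2.0]])
domain_labels = np.array([1.0, 1.0, 0.0, 0.0])

# A confident classifier: low classification loss, high confusion loss.
w_sharp, b = np.array([2.5]), 0.0
clf_loss = domain_classifier_loss(features, domain_labels, w_sharp, b)
conf_loss_separable = domain_confusion_loss(features, w_sharp, b)

# A classifier that cannot separate the domains attains the minimum, log 2.
w_blind = np.array([0.0])
conf_loss_agnostic = domain_confusion_loss(features, w_blind, b)
```

In this toy setting the confusion loss is large when viewpoint is easy to read off the features and drops to log 2 when it is not, which is the sense in which such a loss encourages domain-agnostic features.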