A Self-Supervised Method for Mapping Human Instructions to Robot Policies

27 Sep 2018 (modified: 21 Dec 2018) | ICLR 2019 Conference Blind Submission | Readers: Everyone
  • Abstract: In this paper, we propose a modular approach that splits the instruction-to-action mapping procedure into two stages. The stages are bridged by an intermediate representation called a goal, which represents the outcome of a robot performing a specific task. The first stage maps an input instruction to a goal, and the second stage maps that goal to an appropriate policy selected from a set of robot policies. The policy is selected so that it brings the robot as close to the goal as possible. We implement the two stages as a framework consisting of two distinct modules: an instruction-goal mapping module and a goal-policy mapping module. Given a human instruction in the evaluation phase, the instruction-goal mapping module first translates the instruction into a robot-interpretable goal. The goal-policy mapping module then searches the stored goal-policy pairs for the policy to which the instruction should be mapped. Our experimental results show that, in an environment with a given instruction set, the proposed method learns an effective instruction-to-action mapping more efficiently than the baselines. In addition to this data efficiency, the results show that our method adapts to a new instruction set and to a new robot action space much faster than the baselines. These results suggest that the modular approach improves both adaptability and efficiency.
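
The two-stage mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how either module is realized, so the sketch assumes a plain lookup table for the instruction-goal stage and a Euclidean nearest-neighbor search for the goal-policy stage. All names, goal vectors, and policy labels below are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Goal = Tuple[float, ...]

# Hypothetical instruction-goal table (stand-in for a learned
# instruction-goal mapping module).
INSTRUCTION_TO_GOAL: Dict[str, Goal] = {
    "move left": (-1.0, 0.0),
    "move right": (1.0, 0.0),
    "move up": (0.0, 1.0),
}

# Hypothetical goal-policy pairs: the goal each policy was observed
# to reach, paired with a label identifying that policy.
GOAL_POLICY_PAIRS: List[Tuple[Goal, str]] = [
    ((-0.9, 0.1), "policy_a"),
    ((1.1, -0.1), "policy_b"),
    ((0.0, 0.8), "policy_c"),
]


def instruction_to_goal(instruction: str) -> Goal:
    """Stage 1: translate an instruction into a goal."""
    return INSTRUCTION_TO_GOAL[instruction.strip().lower()]


def goal_to_policy(goal: Goal, pairs: List[Tuple[Goal, str]]) -> str:
    """Stage 2: pick the policy whose reached goal is closest to `goal`.

    Euclidean distance is an assumption made for this sketch.
    """
    _, policy = min(pairs, key=lambda pair: math.dist(goal, pair[0]))
    return policy


def instruction_to_policy(instruction: str) -> str:
    """Compose both stages: instruction -> goal -> policy."""
    goal = instruction_to_goal(instruction)
    return goal_to_policy(goal, GOAL_POLICY_PAIRS)


if __name__ == "__main__":
    for text in ("move left", "move right", "move up"):
        print(text, "->", instruction_to_policy(text))
```

Because the two stages only communicate through the goal, swapping the instruction table (a new instruction set) or the goal-policy pairs (a new action space) leaves the other stage untouched, which is the kind of adaptability the abstract attributes to the modular design.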