Natural Language Instruction-following with Task-related Language Development and Translation

Jing-Cheng Pang; Xinyu Yang; Si-Hang Yang; Xiong-Hui Chen; Yang Yu

Published: 21 Sept 2023, Last Modified: 02 Nov 2023

Venue: NeurIPS 2023 poster

Keywords: Reinforcement learning, instruction-following, autonomous agent

TL;DR: We propose a reinforcement learning algorithm for enabling efficient policy learning in natural language instruction following.

Abstract: Natural language-conditioned reinforcement learning (RL) enables agents to follow human instructions. Previous approaches generally implemented language-conditioned RL by providing the policy with human instructions in natural language (NL) and training the policy to follow them. In this outside-in approach, the policy must comprehend the NL and manage the task simultaneously. However, the unbounded variety of NL expressions often adds considerable complexity to solving concrete RL tasks, which can distract policy learning from completing the task. To ease the learning burden of the policy, we investigate an inside-out scheme for natural language-conditioned RL by developing a task language (TL) that is task-related and easily understood by the policy. In addition, we employ a translator to translate natural language into the TL, which is used in RL to achieve efficient policy training. We implement this scheme as TALAR (TAsk Language with predicAte Representation), which learns multiple predicates to model object relationships as the TL. Experiments indicate that TALAR not only better comprehends NL instructions but also leads to a better instruction-following policy that significantly improves the success rate over baselines and adapts to unseen expressions of NL instructions. The TL is also an effective sub-task abstraction compatible with hierarchical RL.
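
To make the inside-out scheme concrete, the following is a minimal illustrative sketch, not the TALAR implementation. It assumes a toy task language made of hand-written binary predicates over two objects and a keyword-based translator; in TALAR both the predicates and the translator are learned. All names here (`PREDICATE_CUES`, `OBJECT_CUES`, `translate`, `policy_input`) are hypothetical.

```python
# Toy sketch of "translate NL into a task language (TL), then condition the
# policy on the TL". Everything below is a hand-written stand-in: TALAR learns
# its predicates and its translator, whereas this example hard-codes them.

# Hypothetical predicates and the surface phrases that signal them.
PREDICATE_CUES = {
    "left_of": ("left of",),
    "on_top_of": ("on top of", "onto"),
}

# Hypothetical objects and the word used to mention each one.
OBJECT_CUES = {"red_block": "red", "blue_block": "blue"}

# One TL slot per (predicate, first object, second object) combination.
SLOTS = [
    (pred, a, b)
    for pred in PREDICATE_CUES
    for a in OBJECT_CUES
    for b in OBJECT_CUES
    if a != b
]


def translate(instruction: str) -> tuple:
    """Map an NL instruction to a binary TL vector (toy keyword rules)."""
    text = instruction.lower()
    code = []
    for pred, first, second in SLOTS:
        cue_hit = any(cue in text for cue in PREDICATE_CUES[pred])
        i = text.find(OBJECT_CUES[first])
        j = text.find(OBJECT_CUES[second])
        # The slot is active if the predicate is mentioned and the first
        # object is mentioned before the second one.
        code.append(int(cue_hit and 0 <= i < j))
    return tuple(code)


def policy_input(observation: tuple, instruction: str) -> tuple:
    """Build what the policy sees: the observation plus the TL vector,
    rather than the raw NL sentence."""
    return observation + translate(instruction)


a = translate("Put the red block to the left of the blue block")
b = translate("Please move the red cube so it sits left of the blue one")
c = translate("Stack the blue block onto the red block")
```

In this sketch, the two differently worded instructions `a` and `b` map to the same TL vector, so a policy conditioned on the TL only has to handle one compact goal representation instead of every phrasing of it.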

Supplementary Material: zip

Submission Number: 172
