End-to-End Learning to Follow Language Instructions with Compositional Policies

Vanya Cohen; Geraud Nangue Tasse; Nakul Gopalan; Steven James; Ray Mooney; Benjamin Rosman

Published: 15 Nov 2022, Last Modified: 05 May 2023, LangRob 2022 Poster, Readers: Everyone

Keywords: nlp, reinforcement learning

TL;DR: A neural module network composes value functions using natural language inputs to solve compositional language-RL tasks.

Abstract: We develop an end-to-end model for learning to follow language instructions with compositional policies. Our model combines large language models with pretrained compositional value functions to generate policies for goal-reaching tasks specified in natural language. We evaluate our method in the BabyAI environment and demonstrate compositional generalization to held-out combinations of task attributes. In some cases, the model accomplishes those held-out tasks with no additional learning samples.
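To make the idea of composing pretrained value functions concrete, here is a minimal sketch. The abstract does not state which composition operators the model uses, so this assumes elementwise min for conjunction and max for disjunction, one choice found in prior work on compositional value functions. The toy action set, the `Q` table and its numbers, and the helper names are all hypothetical, and the mapping from an instruction to a composition (learned end-to-end in the paper) is hand-written here.

```python
# Hypothetical toy setup: a single state with three "go to object" actions.
ACTIONS = ["red ball", "blue ball", "red key"]

# Assumed pretrained per-attribute action values (illustrative numbers only).
# Entry i is the value of going to ACTIONS[i] under that attribute's task.
Q = {
    "red":  [1.0, 0.0, 1.0],
    "blue": [0.0, 1.0, 0.0],
    "ball": [1.0, 1.0, 0.0],
    "key":  [0.0, 0.0, 1.0],
}

def q_and(a, b):
    """Conjunction of two tasks, assumed here to be the elementwise minimum."""
    return [min(x, y) for x, y in zip(a, b)]

def q_or(a, b):
    """Disjunction of two tasks, assumed here to be the elementwise maximum."""
    return [max(x, y) for x, y in zip(a, b)]

def greedy_action(q):
    """Return the action with the highest composed value (first one on ties)."""
    best = max(range(len(q)), key=q.__getitem__)
    return ACTIONS[best]

# "Go to the red key": a combination composed without any further training.
red_key = q_and(Q["red"], Q["key"])
# "Go to something blue or a key".
blue_or_key = q_or(Q["blue"], Q["key"])

print(greedy_action(red_key))
```

In this sketch, the composed values for "red key" single out the red key even though no value function was trained for that combination, which is the kind of zero-shot behavior the abstract describes for held-out attribute combinations.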
