Keywords: feudal reinforcement learning, textual instruction following, reading to act, text games, multi-hop reasoning
Abstract: Reading to act is a prevalent but challenging task that requires an agent to follow a concise language instruction in an environment, with the help of textual knowledge about that environment. Prior work suffers from a semantic mismatch between low-level actions and high-level language descriptions, and requires human-designed curricula to work properly. In this paper, we present a Feudal Reinforcement Learning (FRL) model consisting of a manager agent and a worker agent. The manager agent is a multi-hop planner that handles high-level abstract information and generates a series of sub-goals. The worker agent handles low-level perceptions and actions to achieve the sub-goals one by one. Our FRL framework effectively alleviates the mismatch between text-level inference and low-level perceptions and actions, and generalizes to various forms of environments, instructions, and manuals. Our multi-hop planner further boosts performance on challenging tasks where multi-step reasoning over the texts is critical to achieving the instructed goals. We show that our approach achieves competitive performance on two challenging tasks, Read to Fight Monsters (RTFM) and Messenger, without human-designed curriculum learning.
One-sentence Summary: We propose a feudal reinforcement learning framework for the task of instruction following, which addresses the mismatch between high-level language descriptions and low-level perceptions.
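The manager/worker decomposition described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the toy one-dimensional grid environment, and the instruction-to-position lookup are all illustrative assumptions; in the paper, the manager is a learned multi-hop planner over the text manual, and the worker is a learned low-level policy.

```python
# Minimal sketch (illustrative, not the paper's code) of the feudal
# decomposition: a high-level "manager" emits a sequence of sub-goals,
# and a low-level "worker" takes primitive actions until each sub-goal
# is reached, at which point control returns to the manager.

class Manager:
    """High-level planner: maps an instruction to a list of sub-goals."""
    def plan(self, instruction, goal_positions):
        # In the paper this is a learned multi-hop planner over the
        # text manual; here we simply look up a target position for
        # each named step in the instruction.
        return [goal_positions[step] for step in instruction]

class Worker:
    """Low-level controller: moves toward one sub-goal at a time."""
    def act(self, position, sub_goal):
        if position < sub_goal:
            return +1  # primitive action: step right
        if position > sub_goal:
            return -1  # primitive action: step left
        return 0       # sub-goal reached

def run_episode(instruction, goal_positions, start=0, max_steps=50):
    manager, worker = Manager(), Worker()
    position = start
    trajectory = [position]
    for sub_goal in manager.plan(instruction, goal_positions):
        for _ in range(max_steps):
            action = worker.act(position, sub_goal)
            if action == 0:
                break  # sub-goal achieved; hand control back to the manager
            position += action
            trajectory.append(position)
    return trajectory

# Example: an instruction decomposed into two sub-goals
# ("reach the key at cell 3, then the door at cell 1").
path = run_episode(["key", "door"], {"key": 3, "door": 1})
```

The key design point the sketch mirrors is the interface between the two agents: the manager never sees primitive actions, and the worker never reasons over the instruction text, which is how the framework separates text-level inference from low-level control.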