Human-Agent Collaboration Strategies for Vision-Grounded Instruction Following

Guan-Lin Chao, Ian R. Lane

Published: 01 Jan 2021, Last Modified: 18 Jun 2024, ASRU 2021, CC BY-SA 4.0

Abstract: In this paper, we explore two human-agent collaboration strategies (human-designed curriculum learning and human-agent dialogue) to improve the training of a Reinforcement Learning agent in the context of vision-grounded instruction following. The agent is given a text instruction to fetch a described object or switch an appliance. To follow the instruction, the agent needs to navigate a simulated multi-room environment and interact with objects, furniture, and appliances. Our first strategy is to train the agent with a human-designed curriculum. We create a series of subtasks and let a human advisor select several of them and arrange the selected subtasks into a curriculum in increasing order of complexity. The agent is then trained on the human-designed curriculum before being trained on the target task. Our second strategy is to enhance the agent with a dialogue module, which allows the agent to query a human collaborator (within a limited number of dialogue exchanges) and obtain three types of hints: action selection, navigation, and recognition. Experiments show that training with a properly designed subtask curriculum accelerates performance growth on the target task compared with training directly on the target task, and that incorporating the human-collaborator dialogue module further improves the agent's task success rate and route efficiency.
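As a rough illustration of the two strategies, the sketch below orders human-selected subtasks by increasing complexity ahead of the target task, and caps the number of hint queries the agent may make. It is only a minimal sketch under stated assumptions: the names `Subtask`, `build_curriculum`, and `DialogueBudget`, the integer complexity score, and the example subtask names are all hypothetical and are not taken from the paper, which may define subtasks, complexity, and the dialogue protocol differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# The three hint types named in the abstract.
HINT_TYPES = ("action_selection", "navigation", "recognition")


@dataclass(frozen=True)
class Subtask:
    """Hypothetical subtask record; `complexity` is an assumed human-assigned score."""
    name: str
    complexity: int


def build_curriculum(selected: List[Subtask], target: Subtask) -> List[Subtask]:
    """Order the selected subtasks by increasing complexity, then append the target task."""
    return sorted(selected, key=lambda s: s.complexity) + [target]


@dataclass
class DialogueBudget:
    """Tracks how many dialogue exchanges the agent has used (assumed per-episode limit)."""
    max_exchanges: int
    used: int = 0

    def ask(self, hint_type: str, advisor: Callable[[str], str]) -> Optional[str]:
        """Return the advisor's hint, or None once the exchange limit is reached."""
        if hint_type not in HINT_TYPES:
            raise ValueError(f"unknown hint type: {hint_type}")
        if self.used >= self.max_exchanges:
            return None
        self.used += 1
        return advisor(hint_type)


if __name__ == "__main__":
    # Hypothetical subtasks chosen by a human advisor.
    target = Subtask("fetch_object_multi_room", complexity=3)
    picked = [Subtask("pick_up_object", 2), Subtask("navigate_to_room", 1)]
    for stage in build_curriculum(picked, target):
        print("train on:", stage.name)

    budget = DialogueBudget(max_exchanges=1)
    print(budget.ask("navigation", lambda kind: "go through the left door"))
    print(budget.ask("recognition", lambda kind: "the mug is on the table"))
```

In this sketch the second query returns `None` because the assumed budget of one exchange is already spent; how the actual agent decides when to spend its exchanges is not specified in the abstract.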