Keywords: human-in-the-loop, deep reinforcement learning, visual policies, conditional policies, robot learning
Abstract: In real-world scenarios, robots can leverage embodied reinforcement learning (RL) agents to solve continuous control problems that are difficult to model under partial observability. Especially when the control inputs are high-dimensional, RL agents can require extensive experience to learn correct mappings from the input space to the action space, a serious limitation given the lack of sufficiently large real-world robotics datasets. Recent work approaches this problem by training agents in synthetic data domains or by bootstrapping learning with direct human supervision. These approaches are often difficult to transfer to the target domain due to the large distribution shift between the training and deployment settings \citep{Zhao+20,ChenHu+22,Chae+22}.
%
We propose a novel learning framework, called Human-INTuited cues for RL, or \hints, in which agents quickly learn to solve tasks by leveraging human coaching. Our experiments in classic control, navigation, and locomotion show that \hints\ enables agents to learn more quickly than vision-only agents and to acquire strategies that generalize to more challenging settings.
Primary Area: reinforcement learning
Submission Number: 19230