HINTs: Human-INTuited Cues for Reinforcement Learning

ICLR 2026 Conference Submission 19230 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: human-in-the-loop, deep reinforcement learning, visual policies, conditional policies, robot learning
Abstract: In real-world scenarios, robots can leverage embodied reinforcement learning (RL) agents to solve continuous control problems that are difficult to model under partial observability. Especially when the control inputs are high-dimensional, RL agents can require extensive experience to learn correct mappings from the input space to action space, a serious limitation given the lack of sufficiently large real-world robotics datasets. Recent work approaches this problem by training agents in synthetic data domains or bootstrapping learning with direct human supervision. They are often difficult to apply to the target domain due to large distribution shift between the training and deployment setting \citep{Zhao+20,ChenHu+22,Chae+22}. % We propose a novel learning framework, called Human-INTuited cues for RL, or \hints, in which agents quickly learn to solve tasks by leveraging human coaching. Our experiments in classic control, navigation, and locomotion reveal that \hints\ enables agents to learn more quickly than vision-only agents and to obtain strategies that apply to more challenging settings.
Primary Area: reinforcement learning
Submission Number: 19230