KhGRL: Kernelized human-Guided Reinforcement Learning

Published: 08 May 2026 · Last Modified: 08 May 2026 · ICRA 2026 Workshop RL4IL Oral · CC BY 4.0
Keywords: Reinforcement Learning, Human in the loop, Imitation Learning
Abstract: Learning from Demonstration (LfD) enables efficient synthesis of user-taught behaviors, while Reinforcement Learning (RL) allows autonomous skill acquisition in complex real-world environments. The Kernelized Guided Reinforcement Learning (KGRL) framework unifies these paradigms by guiding policy exploration with the covariance of user demonstrations and predefined hard constraints, ensuring safe and sample-efficient learning. However, model uncertainty can lead to time-consuming exploration or to policies that generalize poorly when the environment changes. We extend KGRL by integrating human feedback: observation-space corrections are mapped into the corresponding action space and stored in the replay buffer to accelerate exploration. Additionally, we propose a region-dependent action scaling factor learned via regression, which enables locally optimal exploration and compensates for reduced covariance guidance in low-uncertainty regions. We validate the method in a simulation environment requiring obstacle avoidance and goal reaching.
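As a rough illustration of the two extensions the abstract describes, the sketch below shows (i) a human observation-space correction mapped into action space through an assumed linearized inverse model and stored in a replay buffer, and (ii) a region-dependent action scale fitted by ridge regression. All names, the inverse-model interface, and the regression targets are hypothetical placeholders; the abstract does not specify these details.

```python
# Minimal sketch, assuming a linearized inverse model (pseudo-inverse of
# d(obs)/d(action)) and covariance-derived scale targets. Illustrative only,
# not the authors' implementation.
import numpy as np
from collections import deque


class ReplayBuffer:
    """FIFO buffer of (obs, action, reward, next_obs) transitions."""

    def __init__(self, capacity=100_000):
        self.buf = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs):
        self.buf.append((obs, action, reward, next_obs))


def correction_to_action(obs, corrected_obs, jacobian_pinv):
    """Map a human observation-space correction into action space.

    `jacobian_pinv` is an assumed pseudo-inverse of the observation
    Jacobian with respect to the action (a hypothetical inverse model).
    """
    delta_obs = corrected_obs - obs
    return jacobian_pinv @ delta_obs


def fit_scale_regressor(obs_data, scale_targets, reg=1e-2):
    """Ridge regression from observations to a scalar action-scale factor."""
    X = np.hstack([obs_data, np.ones((len(obs_data), 1))])  # bias column
    w = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ scale_targets)
    return w


def predict_scale(obs, w):
    """Predict the region-dependent scale for the current observation."""
    x = np.append(obs, 1.0)
    return float(x @ w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs_dim, act_dim = 4, 2
    J_pinv = rng.normal(size=(act_dim, obs_dim))  # assumed inverse model
    buffer = ReplayBuffer()

    # Synthetic stand-ins for scale targets derived from demonstration
    # covariance (the actual derivation is not given in the abstract).
    obs_data = rng.normal(size=(200, obs_dim))
    scale_targets = 0.5 + np.abs(obs_data[:, 0])
    w = fit_scale_regressor(obs_data, scale_targets)

    # A human correction arrives as a desired observation; convert it to an
    # action, scale it by the local factor, and store it for replay.
    obs = rng.normal(size=obs_dim)
    corrected_obs = obs + rng.normal(scale=0.1, size=obs_dim)
    action = predict_scale(obs, w) * correction_to_action(obs, corrected_obs, J_pinv)
    buffer.add(obs, action, reward=0.0, next_obs=corrected_obs)
```

The ridge regressor here stands in for whatever regression the paper uses; any smooth function approximator mapping observations to a positive scale factor would fill the same role.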
Submission Number: 19