Keywords: Human-to-Robot Learning, Affordance Learning, Learning from Observation, Egocentric Vision, Reinforcement Learning, Imitation Learning, Domain Transfer
TL;DR: Robots learn manipulation skills by watching human videos. Our method uses learned "affordances" to create a semantic reward signal, bridging the human-to-robot visual gap and enabling efficient learning without robot demonstrations.
Abstract: This paper addresses the Human-to-Robot (H2R) transfer problem by introducing Affordance-Guided Reinforcement Learning (AGRL), a framework that enables robots to learn manipulation policies from unstructured human videos. Our key insight is to use scene affordances—learned from large-scale egocentric datasets such as EPIC-KITCHENS—as a transferable, semantic reward signal to guide robot policy learning in benchmarks such as RLBench. Experiments show that AGRL significantly outperforms prior learning-from-observation methods in both success rate and sample efficiency, providing a scalable pathway for translating human experience into robot skills without robot demonstrations.
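To illustrate the core idea of the abstract—using a pretrained affordance model as a semantic reward signal during policy learning—here is a minimal, hypothetical sketch. The environment interface, the `affordance_model` object, and observation keys such as `"rgb"` and `"ee_pixel"` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

class AffordanceRewardWrapper:
    """Hypothetical sketch: wraps a manipulation environment and adds a dense
    reward term derived from a pretrained affordance model's per-pixel heatmap.
    All interfaces below are assumptions made for illustration only."""

    def __init__(self, env, affordance_model, weight=1.0):
        self.env = env                              # base RL environment (assumed API)
        self.affordance_model = affordance_model    # assumed: predicts per-pixel affordance scores
        self.weight = weight                        # scale of the affordance shaping term

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, task_reward, done, info = self.env.step(action)

        # Predict an affordance heatmap over the current RGB observation.
        heatmap = self.affordance_model.predict(obs["rgb"])  # shape (H, W), values in [0, 1]

        # Score how well the end-effector's projected image position lands on
        # high-affordance regions; this acts as a semantic shaping signal.
        u, v = obs["ee_pixel"]                      # assumed: end-effector projected to pixel coords
        affordance_score = float(heatmap[int(v), int(u)])

        shaped_reward = task_reward + self.weight * affordance_score
        info["affordance_score"] = affordance_score
        return obs, shaped_reward, done, info
```

In this sketch, the affordance heatmap replaces robot demonstrations as the source of dense guidance: the sparse task reward is augmented with a term that rewards reaching regions the affordance model deems manipulable, which is one plausible reading of how a semantic reward signal could guide exploration.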
Submission Number: 28