Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids

Published: 18 Jun 2025, Last Modified: 18 Jun 2025, Venue: RSS 2025 Hardware Intelligence Oral, License: CC BY 4.0
Keywords: Humanoid Robots, Sim-to-Real Adaptation, Real-World RL
TL;DR: Robot-Trains-Robot uses a robot arm teacher to actively train a humanoid student, aiming for practical and highly efficient real-world humanoid adaptation and learning.
Abstract: Simulation-based reinforcement learning (RL) has significantly advanced humanoid locomotion tasks, yet direct real-world RL from scratch or starting from pretrained policies remains rare, limiting the full potential of humanoid robots. Real-world training, despite being crucial for overcoming the sim-to-real gap, faces substantial challenges related to safety, reward design, and learning efficiency. To address these limitations, we propose Robot-Trains-Robot (RTR), a novel framework where a robotic arm teacher actively supports and guides a humanoid student robot. The RTR system provides protection, a learning schedule, reward, perturbation, failure detection, and automatic resets, enabling efficient long-term real-world training with minimal human intervention. Furthermore, we propose a novel RL pipeline that facilitates and stabilizes sim-to-real transfer by optimizing a single dynamics-encoded latent variable in the real world. We validate our method through two challenging real-world humanoid tasks: fine-tuning a walking policy for precise speed tracking and learning a humanoid swing-up task from scratch, illustrating the promising capabilities of real-world humanoid learning realized by RTR-style systems.
Submission Number: 4