Keywords: Robotics, Locomotion, Efficiency
TL;DR: An on-robot learning method with joint target and CPG control architectures that achieves omnidirectional quadruped locomotion in a few minutes of training
Abstract: On-robot Reinforcement Learning is a promising approach to train embodiment-aware policies for legged robots.
However, the computational constraints of real-time learning on robots pose a significant challenge.
We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training, exploiting the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ.
We investigate two control architectures: predicting joint target positions for agile, high-speed locomotion, and Central Pattern Generators (CPGs) for stable, natural gaits.
While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion.
Finally, we demonstrate the robustness of our approach in different indoor and outdoor environments.
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Nico_Bohlinger1
Track: Regular Track: unpublished work
Submission Number: 169