Keywords: Quadruped Locomotion, Reinforcement Learning, Reward Machine
TL;DR: We specify and learn diverse quadruped locomotion gaits via Reward Machines, and deploy on hardware.
Abstract: Learning diverse locomotion gaits is important for legged robots to move efficiently and robustly in different environments. Learning a specified gait frequently requires a reward function that accurately describes the gait. Our objective is to develop a simple mechanism for specifying gaits at a high level (e.g., alternating between moving the front feet and the back feet), without providing labor-intensive motion priors such as reference trajectories. In this work, we leverage a recently developed framework called Reward Machine (RM) for high-level gait specification using Linear Temporal Logic (LTL) formulas over foot contacts. Our RM-based approach, called Reward Machine based Locomotion Learning (RMLL), facilitates the learning of specified locomotion gaits while providing a mechanism to dynamically adjust gait frequency, all without the use of motion priors. Experimental results in simulation indicate that leveraging RM to learn specified gaits is more sample-efficient than baselines that do not utilize RM. We also demonstrate the learned policies on a real quadruped robot.
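To make the idea of specifying a gait over foot contacts concrete, the following is a minimal sketch of a two-state reward machine for the "alternate between front feet and back feet" example mentioned in the abstract. It is an illustration under stated assumptions, not the RMLL implementation: the class name `RewardMachine`, the state names `q0` and `q1`, the contact ordering, the guard functions, and the reward value of 1.0 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed contact ordering: (front_left, front_right, rear_left, rear_right).
Contacts = Tuple[bool, bool, bool, bool]
Guard = Callable[[Contacts], bool]


def front_pair_down(c: Contacts) -> bool:
    """Both front feet touch the ground while both rear feet are in the air."""
    fl, fr, rl, rr = c
    return fl and fr and not rl and not rr


def rear_pair_down(c: Contacts) -> bool:
    """Both rear feet touch the ground while both front feet are in the air."""
    fl, fr, rl, rr = c
    return rl and rr and not fl and not fr


@dataclass
class RewardMachine:
    # transitions[state] is a list of (guard, next_state, reward) triples.
    transitions: Dict[str, List[Tuple[Guard, str, float]]]
    state: str = "q0"

    def step(self, contacts: Contacts) -> float:
        """Advance on the current foot contacts and return the reward earned."""
        for guard, next_state, reward in self.transitions[self.state]:
            if guard(contacts):
                self.state = next_state
                return reward
        return 0.0  # no guard fired: stay in the same state, no gait reward


def make_alternating_rm() -> RewardMachine:
    # q0 waits for the front pair to land, q1 waits for the rear pair to land.
    return RewardMachine(transitions={
        "q0": [(front_pair_down, "q1", 1.0)],
        "q1": [(rear_pair_down, "q0", 1.0)],
    })


rm = make_alternating_rm()
trace = [
    (True, True, False, False),    # front pair lands: q0 -> q1
    (True, True, False, False),    # still on the front pair: no transition
    (False, False, False, False),  # flight phase: no transition
    (False, False, True, True),    # rear pair lands: q1 -> q0
    (True, True, False, False),    # front pair lands again: q0 -> q1
]
rewards = [rm.step(c) for c in trace]
print(rewards, rm.state)
```

Because reward is paid only when the expected contact pattern occurs next in the sequence, holding one pose earns nothing. In reward machine methods generally, the current machine state is exposed to the learner alongside the environment observation, so that the policy can tell which phase of the gait it is in.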
Submission Number: 11