Learning Robot Control: From Reinforcement Learning to Differentiable Simulation

Published: 24 Jun 2024, Last Modified: 24 Jun 2024
Venue: EARL 2024 Poster
License: CC BY 4.0
Keywords: Reinforcement Learning, Robot Control, Differentiable Simulation, Optimal Control
TL;DR: We provide new insights for learning robot control by bridging the gap between learning-centric policy training and model-based control.
Abstract: We provide new insights for learning robot control by bridging the gap between learning-centric policy training and model-based control. We leverage principles from optimal control, reinforcement learning, and differentiable simulation to develop control algorithms that enhance the robot’s agility while maintaining robustness in real-world scenarios. First, we show that the fundamental advantage of reinforcement learning (RL) in robotics lies in its optimization objective compared to optimal control. Specifically, RL directly maximizes a task-level objective, which can be non-differentiable, whereas optimal control is restricted by the requirement for smooth and differentiable cost functions. This flexibility in objective design yields more adaptable control policies and more robust performance in unexpected scenarios. Second, we propose using policy search to automatically optimize high-level policies for model predictive control (MPC). This formulation enables policy search to maximize a high-level task objective, while the MPC optimization can concentrate on low-level tracking performance. Third, we explore the potential of differentiable simulation for policy training. Differentiable simulation can provide low-variance first-order gradients, resulting in more stable training and better convergence. We demonstrate near-optimal control performance on a toy double integrator and show the approach’s potential for quadruped locomotion.
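To make the third point concrete, the sketch below illustrates what "differentiable simulation" means for the double-integrator example: the simulator's dynamics are differentiated analytically, so a policy can be trained with exact first-order gradients instead of sampled policy gradients. This is a minimal illustration, not the paper's implementation; the linear feedback policy, the terminal quadratic cost, and all constants (`dt`, horizon, learning rate, initial gains) are assumptions chosen for clarity.

```python
# Minimal sketch (not the paper's code): backprop through a double-integrator
# rollout by hand. Policy is linear state feedback u = -k1*x - k2*v (assumed
# for illustration); we optimize the gains (k1, k2) with exact gradients of a
# terminal cost, i.e. the low-variance first-order signal a differentiable
# simulator provides.

def rollout(k1, k2, x0=1.0, v0=0.0, dt=0.05, steps=100):
    """Simulate x' = v, v' = u with explicit Euler; record the trajectory."""
    xs, vs = [x0], [v0]
    for _ in range(steps):
        x, v = xs[-1], vs[-1]
        u = -k1 * x - k2 * v          # linear feedback policy
        xs.append(x + dt * v)         # position update
        vs.append(v + dt * u)         # velocity update
    return xs, vs

def loss_and_grad(k1, k2, dt=0.05, steps=100):
    """Terminal cost x_T^2 + v_T^2 and its exact gradient w.r.t. (k1, k2)."""
    xs, vs = rollout(k1, k2, dt=dt, steps=steps)
    loss = xs[-1] ** 2 + vs[-1] ** 2
    gx, gv = 2 * xs[-1], 2 * vs[-1]   # adjoint of the terminal state
    gk1 = gk2 = 0.0
    for t in reversed(range(steps)):  # reverse pass through the dynamics
        x, v = xs[t], vs[t]
        gk1 += gv * (-dt * x)         # d v_{t+1} / d k1 = -dt * x_t
        gk2 += gv * (-dt * v)         # d v_{t+1} / d k2 = -dt * v_t
        gx, gv = (gx + gv * (-dt * k1),          # chain rule through
                  gx * dt + gv * (1.0 - dt * k2))  # the Euler step
    return loss, gk1, gk2

# Plain gradient descent on the feedback gains.
k1, k2, lr = 0.5, 0.5, 0.02
for _ in range(150):
    loss, g1, g2 = loss_and_grad(k1, k2)
    k1 -= lr * g1
    k2 -= lr * g2
```

Because the gradient is computed analytically rather than estimated from rollouts, each update uses a zero-variance descent direction; this is the stability and convergence advantage the abstract attributes to differentiable simulation.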
Submission Number: 2