Keywords: Model-Based RL, Rigid Body Motion, Lagrangian Neural Networks
Abstract: We study reinforcement learning (RL) for physical systems. A key drawback of traditional RL algorithms is their poor sample efficiency, and model-based RL is one approach to improving it. Our algorithm learns a model of the environment, namely its transition dynamics and reward function, uses the model to generate imaginary trajectories, and backpropagates through those trajectories to update the policy, exploiting the differentiability of the model. Intuitively, a more accurate model should lead to better performance. Recently, there has been growing interest in building better deep neural network based dynamics models for physical systems through stronger inductive biases. We focus on systems undergoing rigid body motion and compare two versions of our model-based RL algorithm: one uses a standard deep neural network based dynamics model, and the other uses a Lagrangian Neural Network based dynamics model, which exploits the structure of the underlying physics. In environments that are not sensitive to initial conditions, both versions achieve similar average return, while the physics-informed version achieves better sample efficiency. In environments that are sensitive to initial conditions, the physics-informed version achieves significantly better average return and sample efficiency. In these latter environments, our physics-informed model-based RL approach achieves a better average return than Soft Actor-Critic, a state-of-the-art model-free RL algorithm.
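
To illustrate how a Lagrangian-based dynamics model can turn a scalar Lagrangian into accelerations, the following is a minimal JAX sketch, not the implementation used in this work. It assumes an analytic pendulum Lagrangian as a stand-in for a learned network, ignores control inputs and friction, and the names `lagrangian` and `acceleration` are hypothetical. The acceleration is obtained by solving the Euler-Lagrange equations for the second time derivative of the generalized coordinates.

```python
import jax
import jax.numpy as jnp


def lagrangian(q, q_dot, m=1.0, l=1.0, g=9.81):
    # Assumption: analytic simple-pendulum Lagrangian L = T - V, standing in
    # for a learned Lagrangian network. q is the angle from the downward
    # vertical, stored as a length-1 array.
    kinetic = 0.5 * m * l**2 * jnp.sum(q_dot**2)
    potential = -m * g * l * jnp.sum(jnp.cos(q))
    return kinetic - potential


def acceleration(L, q, q_dot):
    # Euler-Lagrange equations solved for q_ddot:
    #   (d2L/dq_dot dq_dot) q_ddot = dL/dq - (d2L/dq_dot dq) q_dot
    grad_q = jax.grad(L, argnums=0)(q, q_dot)
    hess_q_dot = jax.hessian(L, argnums=1)(q, q_dot)
    # mixed[i, j] = d/dq_j of (dL/dq_dot_i)
    mixed = jax.jacobian(jax.grad(L, argnums=1), argnums=0)(q, q_dot)
    return jnp.linalg.solve(hess_q_dot, grad_q - mixed @ q_dot)


q = jnp.array([jnp.pi / 2])   # pendulum held horizontally
q_dot = jnp.array([0.3])      # arbitrary angular velocity
q_ddot = acceleration(lagrangian, q, q_dot)
```

For this pendulum the result should match the textbook expression, acceleration equal to -(g/l) sin(q), regardless of the angular velocity. Because every step is differentiable, an acceleration computed this way could be integrated to predict the next state and then backpropagated through, which is the property the model-based algorithm described above relies on.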