Keywords: Reinforcement Learning, Control, Stability, Lyapunov
TL;DR: A novel method for computing Lyapunov functions from off-policy data that can provide stability certificates to reinforcement learning algorithms.
Abstract: Traditional reinforcement learning lacks the ability to provide stability guarantees. More recent algorithms learn Lyapunov functions alongside the control policies to ensure stable learning. However, current self-learned Lyapunov functions are sample-inefficient due to their on-policy nature. This paper introduces a method for learning Lyapunov functions off-policy and incorporates the proposed off-policy Lyapunov function into the Soft Actor-Critic and Proximal Policy Optimization algorithms to provide them with a data-efficient stability certificate. Simulations of an inverted pendulum and a quadrotor illustrate the improved performance of the two algorithms when endowed with the proposed off-policy Lyapunov function.
Submission Number: 412