Keywords: Reinforcement Learning, Control, Stability, Lyapunov
TL;DR: A novel method for computing Lyapunov functions from off-policy data that can provide stability certificates to reinforcement learning algorithms.
Abstract: Traditional reinforcement learning lacks the ability to provide stability guarantees. More recent algorithms learn Lyapunov functions alongside the control policies to ensure stable learning. However, current self-learned Lyapunov functions are sample-inefficient due to their on-policy nature. This paper introduces a method for learning Lyapunov functions off-policy and incorporates the proposed off-policy Lyapunov function into the Soft Actor-Critic and Proximal Policy Optimization algorithms to provide them with a data-efficient stability certificate. Simulations of an inverted pendulum and a quadrotor illustrate the improved performance of the two algorithms when endowed with the proposed off-policy Lyapunov function.
Submission Number: 412