Lyapunov Guarantees for Learned Policies

Published: 13 Mar 2024, Last Modified: 22 Apr 2024 · ALA 2024 · CC BY 4.0
Keywords: Reinforcement Learning, Lyapunov Stability, Stability Guarantees
TL;DR: We use Neural Networks as Lyapunov candidates to learn stability proofs and corresponding Domains of Attraction for a given policy.
Abstract: Reinforcement-learned policies are notorious for their lack of supporting proofs and guarantees. One approach to providing such guarantees is learning a Domain of Attraction that proves stability based on the Lyapunov Stability Criterion. We build on this approach to improve performance and ease of implementation, presenting a highly parallelizable algorithm that produces a uniform grid tessellating the desired region of the state space. By discretizing the state space, we take advantage of the Lipschitz nature of the problem to prove that not only is a sample point stable, but so is a neighborhood around it. This discretization is then combined with existing algorithms to learn a neural network that can be used as a Lyapunov candidate. We present our proposed algorithm, demonstrate it on a torque-limited inverted pendulum, and highlight the effects of our improvements in experimental results.
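As a rough illustration of the grid-based check the abstract describes, the sketch below assumes a discrete-time closed loop x_{k+1} = f(x_k, pi(x_k)), a Lyapunov candidate V, a known Lipschitz constant L for the decrease dV(x) = V(f(x, pi(x))) - V(x), and a grid in which every state lies within tau of some sample point; all function names here are hypothetical, not the paper's implementation. Verifying dV(x_i) < -L * tau at each grid point then certifies dV < 0 on the entire cell around that point.

```python
import numpy as np

# Hypothetical sketch of Lipschitz-based grid verification; the names
# (delta_v, verified_mask, lipschitz_dv, tau) are illustrative assumptions,
# not the submission's actual API.

def delta_v(x, V, f, pi):
    """Decrease of the Lyapunov candidate V along one closed-loop step."""
    return V(f(x, pi(x))) - V(x)

def verified_mask(grid, V, f, pi, lipschitz_dv, tau):
    """
    grid:         (N, d) array of sample points tessellating the region
    lipschitz_dv: Lipschitz constant of delta_v on the region (assumed known)
    tau:          max distance from any state in a cell to its grid point

    A grid point certifies its whole cell if the decrease condition holds
    with a Lipschitz margin: delta_v(x) < -lipschitz_dv * tau.
    """
    dv = np.array([delta_v(x, V, f, pi) for x in grid])  # each check is independent
    return dv < -lipschitz_dv * tau
```

Because each grid point is checked independently, the loop parallelizes trivially, which is what makes a scheme of this kind highly parallelizable; the Domain of Attraction can then be taken as the largest sublevel set of V contained in the verified cells.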
Type Of Paper: Work-in-progress paper (max 6 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 4