Keywords: Reinforcement Learning, Lyapunov Stability, Stability Guarantees
TL;DR: We use Neural Networks as Lyapunov candidates to learn stability proofs, and corresponding Domains of Attraction for a given policy.
Abstract: Reinforcement Learning policies are notorious for their lack of supporting proofs and guarantees.
One approach to providing such guarantees is learning a Domain of Attraction that proves stability based on the Lyapunov Stability Criterion.
We build on this approach to improve performance and ease of implementation, and present a highly parallelizable algorithm that produces a uniform grid that tessellates the desired region of the state space.
By discretizing the state space, we take advantage of the Lipschitz nature of the problem to prove that not only is a sample point stable, but so is a neighborhood around it.
This discretization is then combined with existing algorithms to learn a neural network that can be used as a Lyapunov candidate.
We present our proposed algorithm, demonstrate it on a torque-limited inverted pendulum, and highlight the effects of our improvements in experimental results.
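The Lipschitz argument in the abstract can be sketched concretely: if a certificate condition such as V̇(x) < 0 holds with margin at a grid sample, and V̇ is Lipschitz with constant L, then the condition holds on the whole cell of radius r whenever V̇(x) + L·r < 0. The following is a minimal, hypothetical illustration of this cell-wise check on a toy system (all names and the linear example dynamics are our assumptions, not the paper's implementation):

```python
import numpy as np

def certify_grid(v_dot, points, cell_radius, lipschitz_const):
    """Certify V_dot < 0 on each grid cell via a Lipschitz bound.

    If V_dot(x) + L * r < 0 at a sample x, then V_dot < 0 everywhere
    within distance r of x. Hypothetical helper, for illustration only.
    """
    values = np.array([v_dot(x) for x in points])
    return values + lipschitz_const * cell_radius < 0

# Toy example: V(x) = ||x||^2 for the stable linear system xdot = -x,
# so V_dot(x) = 2 x . (-x) = -2 ||x||^2 (not the pendulum from the paper).
v_dot = lambda x: -2.0 * np.dot(x, x)

# Uniform grid tessellating [-1, 1]^2, excluding the origin sample.
axis = np.linspace(-1.0, 1.0, 9)
points = np.array([[a, b] for a in axis for b in axis
                   if not (a == 0.0 and b == 0.0)])
cell_radius = (axis[1] - axis[0]) / np.sqrt(2)  # half-diagonal of a cell

# L bounds the gradient of V_dot on the region: |grad V_dot| = 4||x|| <= 4*sqrt(2).
ok = certify_grid(v_dot, points, cell_radius, lipschitz_const=4 * np.sqrt(2))
```

Cells far from the origin certify, while cells too close to it fail the margin test, which is why such methods exclude a small neighborhood of the equilibrium or refine the grid there.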
Type Of Paper: Work-in-progress paper (max 6 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 4