Stabilizing Dynamical Systems via Policy Gradient Methods

Juan Carlos Perdomo; Jack Umenberger; Max Simchowitz

Published: 09 Nov 2021, Last Modified: 05 May 2023

Venue: NeurIPS 2021 Poster

Readers: Everyone

Keywords: control theory, stability, LQR, policy gradients

TL;DR: We introduce model-free policy gradient algorithms that find stabilizing controllers for unknown linear and nonlinear dynamical systems.

Abstract: Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering. In this paper, we provide a simple, model-free algorithm for stabilizing fully observed dynamical systems. While model-free methods have become increasingly popular in practice due to their simplicity and flexibility, stabilization via direct policy search has received surprisingly little attention. Our algorithm proceeds by solving a series of discounted LQR problems, where the discount factor is gradually increased. We prove that this method efficiently recovers a stabilizing controller for linear systems, and for smooth, nonlinear systems within a neighborhood of their equilibria. Our approach overcomes a significant limitation of prior work, namely the need for a pre-given stabilizing control policy. We empirically evaluate the effectiveness of our approach on common control benchmarks.
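
Illustrative sketch (not the authors' implementation): the discount-annealing idea described in the abstract can be illustrated on a small linear system. The sketch below rests on several simplifying assumptions. It uses known matrices A and B and exact gradients of the discounted LQR cost, whereas the paper's method is model-free and works from estimates. The rule for increasing the discount factor is a stand-in based on the closed-loop spectral radius and is not claimed to match the paper's rule. All function names and the example system are hypothetical.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def solve_lyapunov(M, W):
    """Solve X = W + M X M^T (requires spectral_radius(M) < 1)."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def cost_and_grad(K, A, B, Q, R, gamma):
    """Discounted LQR cost and its gradient for the policy u = -K x.

    Discounting by gamma is equivalent to undiscounted LQR on the damped
    system (sqrt(gamma) A, sqrt(gamma) B). The initial state is assumed
    to have identity covariance.
    """
    Ag, Bg = np.sqrt(gamma) * A, np.sqrt(gamma) * B
    M = Ag - Bg @ K
    if spectral_radius(M) >= 1.0:
        return np.inf, None  # cost is infinite for this discount factor
    P = solve_lyapunov(M.T, Q + K.T @ R @ K)        # cost-to-go matrix
    Sigma = solve_lyapunov(M, np.eye(A.shape[0]))   # state correlation
    grad = 2.0 * ((R + Bg.T @ P @ Bg) @ K - Bg.T @ P @ Ag) @ Sigma
    return float(np.trace(P)), grad

def policy_gradient(K, A, B, Q, R, gamma, steps=200):
    """Gradient descent with step halving; accepts only cost-decreasing steps."""
    for _ in range(steps):
        cost, grad = cost_and_grad(K, A, B, Q, R, gamma)
        eta = 1.0
        while eta > 1e-12:
            new_cost, _ = cost_and_grad(K - eta * grad, A, B, Q, R, gamma)
            if new_cost < cost:
                K = K - eta * grad
                break
            eta *= 0.5
        else:
            break  # no improving step found; treat as converged
    return K

def discount_annealing(A, B, Q, R, max_rounds=100):
    """Solve discounted LQR problems with a gradually increasing discount."""
    n, m = B.shape
    K = np.zeros((m, n))
    # Start with a discount small enough that K = 0 has finite cost.
    rho0 = max(spectral_radius(A), 1e-8)
    gamma = min(1.0, 0.9 / rho0 ** 2)
    for _ in range(max_rounds):
        K = policy_gradient(K, A, B, Q, R, gamma)
        rho = spectral_radius(A - B @ K)
        if rho < 1.0:
            return K, gamma  # K stabilizes the undiscounted system
        # Assumed update rule: move gamma halfway toward the largest
        # discount for which the current K still has finite cost.
        gamma = min(1.0, 0.5 * (gamma + 1.0 / rho ** 2))
    raise RuntimeError("no stabilizing controller found within max_rounds")

# Hypothetical unstable, controllable example system.
A = np.array([[1.1, 0.5], [0.0, 1.2]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K, final_gamma = discount_annealing(A, B, Q, R)
print("open-loop spectral radius:", spectral_radius(A))
print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

The zero controller is a valid starting point only because the initial discount factor is chosen small enough to make its discounted cost finite, which is what removes the need for a pre-given stabilizing policy in this sketch.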

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

Code: zip
