Abstract: Policy gradient algorithms are reinforcement learning methods that optimize a control policy by performing stochastic gradient descent with respect to the controller parameters. In this paper, we extend actor-critic algorithms by adding an $\ell_1$-norm regularization term on the actor part, which enables the algorithm to automatically select and optimize the useful controller basis functions. Our method is closely related to existing approaches to sparse controller design and actuator selection, but in contrast to these, our approach runs online
and does not require a plant model. In order to utilize $\ell_1$ regularization online, the actor updates are extended to include an
iterative soft-thresholding step. Convergence of the algorithm is proved using methods from stochastic approximation. The
effectiveness of our algorithm for control basis and actuator selection is demonstrated on numerical examples.
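The abstract does not spell out the update equations, so the following is only a rough illustrative sketch (in Python/NumPy) of how an $\ell_1$-regularized actor-critic update with an iterative soft-thresholding step might look for a linear-in-features Gaussian policy. All function names, step sizes, and the choice of policy parameterization are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def soft_threshold(w, kappa):
    """Elementwise soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - kappa, 0.0)

def actor_critic_step(theta, v, phi_s, phi_s_next, action, reward,
                      gamma=0.99, alpha_critic=0.05, alpha_actor=0.01, lam=0.1):
    """One hypothetical l1-regularized actor-critic update.

    theta             : actor weights over controller basis functions (to be sparsified)
    v                 : critic weights (linear value-function approximation)
    phi_s, phi_s_next : feature vectors of the current and next state
    action            : scalar action sampled from a Gaussian policy with mean theta @ phi_s
    reward            : observed reward
    lam               : l1 regularization strength on the actor
    """
    # Critic: TD(0) update of the linear value-function approximation
    delta = reward + gamma * (v @ phi_s_next) - (v @ phi_s)
    v = v + alpha_critic * delta * phi_s

    # Actor: policy-gradient step using the TD error as the advantage estimate,
    # for a Gaussian policy with mean theta @ phi_s and unit variance
    score = (action - theta @ phi_s) * phi_s      # gradient of log-density w.r.t. theta
    theta = theta + alpha_actor * delta * score

    # Iterative soft-thresholding step: proximal map of the l1 penalty,
    # which drives the weights of unneeded basis functions to exactly zero
    theta = soft_threshold(theta, alpha_actor * lam)
    return theta, v
```

In this sketch the soft-thresholding acts as the proximal step of the $\ell_1$ penalty applied after each stochastic actor update, which is what yields exact zeros in the actor weights and hence an automatic selection of controller basis functions (or actuators, when each basis function corresponds to an actuator).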