How are policy gradient methods affected by the limits of control?

Published: 01 Jan 2022, Last Modified: 12 May 2023. CoRR 2022.
Abstract: We study stochastic policy gradient methods from the perspective of control-theoretic limitations. Our main result is that ill-conditioned linear systems in the sense of Doyle inevitably lead to noisy gradient estimates. We also give an example of a class of stable systems in which policy gradient methods suffer from the curse of dimensionality. Our results apply to both state feedback and partially observed systems.
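To make the object of study concrete, here is a minimal sketch of the kind of stochastic policy gradient estimator the abstract refers to: a zeroth-order (REINFORCE-style) estimate of a finite-horizon LQR cost with respect to a linear state-feedback gain. This is an illustration under assumed dynamics and parameters, not the paper's construction; all matrices, the horizon, and the smoothing scale `sigma` below are hypothetical.

```python
import numpy as np

def lqr_cost(A, B, Q, R, K, x0, horizon=50):
    """Finite-horizon quadratic cost of the state-feedback law u_t = -K x_t."""
    x, cost = x0.copy(), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

def pg_estimate(A, B, Q, R, K, x0, sigma=0.1, samples=100, rng=None):
    """Zeroth-order policy gradient estimate of the cost w.r.t. the gain K.

    Perturbs K with Gaussian noise and weights the observed cost by the
    perturbation, i.e. a REINFORCE-style estimator for a Gaussian policy
    over gains. The spread of these estimates across seeds is the "noise"
    in the gradient estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(K)
    for _ in range(samples):
        eps = rng.standard_normal(K.shape)
        grad += lqr_cost(A, B, Q, R, K + sigma * eps, x0) * eps
    return grad / (samples * sigma)

if __name__ == "__main__":
    # Hypothetical 2-state, 1-input system; not taken from the paper.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    Q, R = np.eye(2), np.eye(1)
    K = np.array([[1.0, 1.0]])
    x0 = np.array([1.0, 0.0])
    grads = [pg_estimate(A, B, Q, R, K, x0, rng=np.random.default_rng(s))
             for s in range(10)]
    print("per-seed gradient norms:",
          [round(float(np.linalg.norm(g)), 2) for g in grads])
```

Running the script prints the gradient-estimate norms across random seeds; the variation between seeds is the estimator noise that, per the abstract, becomes unavoidable for ill-conditioned systems in the sense of Doyle.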