How are policy gradient methods affected by the limits of control?

Published: 01 Jan 2022, Last Modified: 12 May 2023. CoRR 2022.
Abstract: We study stochastic policy gradient methods from the perspective of control-theoretic limitations. Our main result is that ill-conditioned linear systems in the sense of Doyle inevitably lead to noisy gradient estimates. We also give an example of a class of stable systems in which policy gradient methods suffer from the curse of dimensionality. Our results apply to both state feedback and partially observed systems.
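To make the object of study concrete, here is a minimal sketch of the kind of stochastic policy gradient estimator the abstract refers to: a zeroth-order (REINFORCE-style) estimate of a finite-horizon LQR cost with respect to a linear state-feedback gain. This is an illustration under assumed dynamics and parameters, not the paper's construction; all matrices, the horizon, and the smoothing scale `sigma` below are hypothetical.

```python
import numpy as np

def lqr_cost(A, B, Q, R, K, x0, horizon=50):
    """Finite-horizon quadratic cost of the state-feedback law u_t = -K x_t."""
    x, cost = x0.copy(), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

def pg_estimate(A, B, Q, R, K, x0, sigma=0.1, samples=100, rng=None):
    """Zeroth-order policy gradient estimate of the cost w.r.t. the gain K.

    Perturbs K with Gaussian noise and weights the observed cost by the
    perturbation, i.e. a REINFORCE-style estimator for a Gaussian policy
    over gains. The spread of these estimates across seeds is the "noise"
    in the gradient estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(K)
    for _ in range(samples):
        eps = rng.standard_normal(K.shape)
        grad += lqr_cost(A, B, Q, R, K + sigma * eps, x0) * eps
    return grad / (samples * sigma)

if __name__ == "__main__":
    # Hypothetical 2-state, 1-input system; not taken from the paper.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    Q, R = np.eye(2), np.eye(1)
    K = np.array([[1.0, 1.0]])
    x0 = np.array([1.0, 0.0])
    grads = [pg_estimate(A, B, Q, R, K, x0, rng=np.random.default_rng(s))
             for s in range(10)]
    print("per-seed gradient norms:",
          [round(float(np.linalg.norm(g)), 2) for g in grads])
```

Running the script prints the gradient-estimate norms across random seeds; the variation between seeds is the estimator noise that, per the abstract, becomes unavoidable for ill-conditioned systems in the sense of Doyle.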