Keywords: distributionally robust optimization, linear-quadratic-Gaussian control, linear policies
Abstract: We study a generalization of the classical discrete-time Linear-Quadratic-Gaussian (LQG) control problem in which the noise distributions affecting the states and observations are unknown and chosen adversarially from divergence-based ambiguity sets centered around a known nominal distribution. For a finite-horizon model with Gaussian nominal noise and a structural assumption on the divergence that is satisfied by many examples -- including the 2-Wasserstein distance, the Kullback-Leibler divergence, moment-based divergences, entropy-regularized optimal transport, and the Fisher (score-matching) divergence -- we prove that a control policy that is \emph{affine} in the observations is optimal and that the adversary's corresponding worst-case distribution is Gaussian.
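For orientation, the problem can be sketched as the min-max program below; the notation (system matrices $A, B, C$, divergence $D$, radius $\rho$) is illustrative and not taken verbatim from the paper:
\[
\min_{\pi}\ \max_{\mathbb{P}\,:\,D(\mathbb{P},\mathbb{P}_0)\le\rho}\ \mathbb{E}^{\mathbb{P}}\!\left[\sum_{t=0}^{T-1}\bigl(x_t^{\top}Q\,x_t+u_t^{\top}R\,u_t\bigr)+x_T^{\top}Q_T\,x_T\right],
\]
where $x_{t+1}=Ax_t+Bu_t+w_t$ and $y_t=Cx_t+v_t$, the policy $\pi$ maps observations $y_{0:t}$ to inputs $u_t$, $\mathbb{P}_0$ is the nominal (Gaussian) distribution of the noise sequence $(w_t, v_t)$, and the inner maximization is the adversary's choice within the divergence ball.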
When the nominal means are zero (as in the classical LQG model), we show that the adversary optimally sets the distribution's mean to zero and the optimal control policy becomes \emph{linear}. Moreover, the adversary optimally ``inflates'' the noise by choosing covariance matrices that dominate the nominal covariance in the Loewner order. Exploiting these structural properties, we develop a Frank-Wolfe algorithm whose inner step solves standard LQG subproblems via Kalman filtering and dynamic programming, and we show that this implementation consistently outperforms semidefinite-programming reformulations of the problem. We then extend all structural results to an infinite-horizon, average-cost formulation, where we prove that \emph{stationary} linear policies are optimal for the decision maker and \emph{time-invariant} Gaussian distributions are optimal for the adversary. Lastly, when the divergence is the 2-Wasserstein distance, we show that the entire framework remains valid if the nominal distributions are elliptical rather than Gaussian.
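As a rough illustration of how such a scheme can be organized (this is not the authors' implementation), the sketch below runs a Frank-Wolfe outer loop for the adversary in a deliberately simplified setting: time-invariant matrices, zero nominal means, only the process-noise covariance is perturbed, a Frobenius-norm ball stands in for the divergence ball, and gradients are taken by finite differences. The inner step evaluates the standard LQG optimal cost through a backward Riccati recursion (dynamic programming) and a forward Kalman-filter recursion. Every matrix, radius, and function name here is an assumption made for the example.

```python
import numpy as np

def lqg_cost(W, V, A, B, C, Q, R, QT, Sigma0, T):
    """Optimal expected cost of the standard finite-horizon LQG problem with
    zero-mean noises: process-noise covariance W, measurement-noise covariance V."""
    # Backward Riccati recursion (dynamic programming for the control problem).
    P = [None] * (T + 1)
    P[T] = QT
    Gamma = [None] * T  # penalty weights on the filtering error covariance
    for t in reversed(range(T)):
        BtP = B.T @ P[t + 1]
        K = np.linalg.solve(R + BtP @ B, BtP @ A)            # LQR gain at time t
        P[t] = Q + A.T @ P[t + 1] @ A - A.T @ P[t + 1] @ B @ K
        Gamma[t] = K.T @ (R + BtP @ B) @ K
    # Forward Riccati recursion (Kalman filter), assuming y_t is observed before u_t.
    cost = np.trace(P[0] @ Sigma0)
    Sig_pred = Sigma0
    for t in range(T):
        S = C @ Sig_pred @ C.T + V
        Sig_filt = Sig_pred - Sig_pred @ C.T @ np.linalg.solve(S, C @ Sig_pred)
        cost += np.trace(Gamma[t] @ Sig_filt) + np.trace(P[t + 1] @ W)
        Sig_pred = A @ Sig_filt @ A.T + W
    return cost

def ascent_vertex(grad, W_nom, rho):
    """Maximizer of <grad, S> over the Frobenius ball of radius rho around W_nom,
    then projected onto the PSD cone so the result is a valid covariance."""
    G = 0.5 * (grad + grad.T)
    S = W_nom + rho * G / (np.linalg.norm(G) + 1e-12)
    w, U = np.linalg.eigh(0.5 * (S + S.T))
    return U @ np.diag(np.clip(w, 0.0, None)) @ U.T

def num_grad(f, X, eps=1e-5):
    """Entrywise finite-difference gradient of a scalar function of a matrix."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

if __name__ == "__main__":
    # Small double-integrator example; all numbers are illustrative.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    C = np.array([[1.0, 0.0]])
    Q, R, QT = np.eye(2), np.eye(1), np.eye(2)
    Sigma0 = 0.1 * np.eye(2)
    W_nom, V_nom = 0.05 * np.eye(2), 0.01 * np.eye(1)
    rho, T = 0.05, 20

    J = lambda Wx: lqg_cost(Wx, V_nom, A, B, C, Q, R, QT, Sigma0, T)
    W = W_nom.copy()
    for k in range(50):                        # Frank-Wolfe outer loop (adversary)
        grad = num_grad(J, W)
        vertex = ascent_vertex(grad, W_nom, rho)
        eta = 2.0 / (k + 2.0)                  # standard Frank-Wolfe step size
        W = (1.0 - eta) * W + eta * vertex
    print("nominal cost    :", J(W_nom))
    print("worst-case cost :", J(W))
    print("eigenvalues of W - W_nom:", np.linalg.eigvalsh(W - W_nom))
```

In this toy instance the last print lets one check numerically whether the worst-case covariance dominates the nominal one, echoing the inflation result stated above; in the paper's setting the inner oracle is an exact LQG solve rather than the simplified cost evaluation used here.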
Submission Number: 65