['6c6', '< The Linear Quadratic Regulator (LQR) is a classic control problem that has served as a building block for numerous applications in engineering and computer science [3,12], economics [29], or neuroscience [47]. It involves controlling a system with linear dynamics and imperfect observations affected by additive noise, with the goal of minimizing a quadratic state and control cost. Under the assumption that noise terms are independent and normally distributed (a case referred to as Linear-Quadratic-Gaussian, or LQG), it is well known that the optimal control policy depends linearly on the observations and can be obtained efficiently by using the Kalman filtering procedure and dynamic programming [8].', '---', '> The Linear Quadratic Regulator (LQR) and its stochastic counterpart, Linear-Quadratic-Gaussian (LQG) control, are foundational paradigms in control theory, finding widespread application across diverse fields such as engineering, computer science, economics, and neuroscience [3,12,29,47]. These methods provide optimal control policies for systems with linear dynamics, imperfect observations, and quadratic cost functions, under the critical assumption that additive noise terms are independent and normally distributed. In this classical setting, the optimal control policy is well-known to be linear in the observations and can be efficiently derived using Kalman filtering and dynamic programming [8]. 
However, real-world systems frequently operate under conditions where noise distributions are uncertain, non-Gaussian, or subject to adversarial perturbations, rendering the strict Gaussian assumption overly restrictive and potentially leading to suboptimal or brittle control performance.', '17c17', '< We consider a discrete-time linear dynamical system', '---', '> We consider a discrete-time linear dynamical system, a common model in control theory, described by:', '19c19', '< with states x t ∈ R n , control inputs u t ∈ R m , process noise w t ∈ R n and system matrices A t ∈ R n×n and B t ∈ R n×m . The controller only has access to imperfect state measurements', '---', "> Here, x t ∈ R n represents the system state, u t ∈ R m are the control inputs, and w t ∈ R n denotes the process noise. The system dynamics are governed by matrices A t ∈ R n×n and B t ∈ R n×m . The controller's information is limited to imperfect state measurements:", '21c21', '< corrupted by observation noise v t ∈ R p , where C t ∈ R p×n and usually p ≤ n (so that observing y t would not allow reconstructing x t even if there were no observation noise). The control inputs u t are causal, i.e., depend on the past observations y 0 , . . . , y t but not on the future observations y t+1 , . . . , y T -1 . More precisely, the set of feasible control inputs U y is the set of random vectors (u 0 , u 1 , . . . , u T -1 ) where for every t there exists a measurable control policy φ t : R p(t+1) → R m such that u t = φ t (y 0 , . . . , y t ). Controlling the system generates costs that depend quadratically on the states and the controls:', '---', '> These measurements are corrupted by observation noise v t ∈ R p , with C t ∈ R p×n . Typically, p ≤ n, implying that direct state reconstruction from measurements is not straightforward, even without noise. The control inputs u t are causal, meaning they depend solely on past and current observations y 0 , . . . , y t , but not on future information. 
Formally, the set of feasible control inputs U y comprises random vectors (u 0 , u 1 , . . . , u T -1 ) where each u t is generated by a measurable control policy φ t : R p(t+1) → R m such that u t = φ t (y 0 , . . . , y t ). The objective is to minimize a quadratic cost function that penalizes deviations in states and control efforts:', '23,27c23,27', '< where Q t ∈ S n + and R t ∈ S m ++ represent the state and input cost matrices, respectively. The exogenous random vectors x 0 , {w t } T -1 t=0 and {v t } T -1 t=0 are mutually independent and follow probability distributions given by P x0 , {P wt } T -1 t=0 , and {P vt } T -1 t=0 , respectively. As the control inputs are causal, the system equations (2) imply that x t , u t and y t can be expressed as measurable functions of the exogenous uncertainties x 0 as well as w s and v s , s ∈ [t], for every t. From now on we may thus assume without loss of generality that Ω = R n × R n×T × R p×T is the space of realizations of the exogenous uncertainties, F is the Borel σ-algebra on Ω and P = P x0 ⊗ (⊗ T -1 t=0 P wt ) ⊗ (⊗ T t=0 P vt ), where P 1 ⊗ P 2 denotes the independent coupling of the distributions P 1 and P 2 .', '< In this context, the classic LQG model assumes that P is known and Gaussian, and seeks u ∈ U y that minimizes E P [J]. Appendix §A reviews the standard approach for computing optimal control inputs by estimating states through Kalman filtering techniques and using dynamic programming.', '< In contrast, we assume that P is only known to belong to an ambiguity set W, and we formulate a distributionally robust LQG problem that seeks u ∈ U y to minimize the worst-case expected cost:', '< max P∈W E P T -1 t=0 (x ⊤ t Q t x t + u ⊤ t R t u t ) + x ⊤ T Q T x T .(4)', '< We construct the ambiguity set W as a ball based on the Wasserstein distance. 
Specifically, we assume that a nominal Gaussian distribution', '---', '> where Q t ∈ S n + and R t ∈ S m ++ are positive semi-definite and positive definite matrices, respectively, representing state and input costs. The exogenous random vectors x 0 (initial state), {w t } T -1 t=0 (process noise), and {v t } T -1 t=0 (observation noise) are assumed to be mutually independent, following probability distributions P x0 , {P wt } T -1 t=0 , and {P vt } T -1 t=0 . Given the causality of control inputs, x t , u t , and y t can be expressed as measurable functions of the exogenous uncertainties up to time t. Without loss of generality, we define the probability space Ω = R n × R n×T × R p×T as the space of realizations of these uncertainties, with F as the Borel σ-algebra and P = P x0 ⊗ (⊗ T -1 t=0 P wt ) ⊗ (⊗ T t=0 P vt ), where ⊗ denotes independent coupling.', '> ', '> In the classic LQG model, P is assumed to be known and Gaussian, and the problem aims to find u ∈ U y that minimizes E P [J]. Appendix §A details the standard approach using Kalman filtering and dynamic programming. However, this paper addresses a more realistic and challenging scenario: the noise distributions are unknown. We model this uncertainty by assuming P belongs to an ambiguity set W, and we formulate a distributionally robust LQG problem that seeks u ∈ U y to minimize the worst-case expected cost:', '> max P∈W E P T -1 t=0 (x ⊤ t Q t x t + u ⊤ t R t u t ) + x ⊤ T Q T x T .(4)', '> We construct the ambiguity set W as a ball based on the Wasserstein distance, centered around a nominal Gaussian distribution. Specifically, we assume a nominal distribution', '29c29', '< ) is available so that Px0 = N (0, X0 ), Pwt = N (0, Ŵt ), and Pvt = N (0, Vt ) for all t ∈ [T -1], and W is given by:', '---', '> ) is available, where Px0 = N (0, X0 ), Pwt = N (0, Ŵt ), and Pvt = N (0, Vt ) for all t ∈ [T -1]. The ambiguity set W is then defined as:', '33c33', '< and W is the 2-Wasserstein distance. 
Thus, by construction, all exogenous random variables x 0 , w 0 , . . . , w T -1 , v 0 , . . . , v T -1 are independent under every distribution in W.', '---', '> and W is the 2-Wasserstein distance. This construction ensures that all exogenous random variables x 0 , w 0 , . . . , w T -1 , v 0 , . . . , v T -1 remain independent under any distribution within W.', '37c37', '< Our model strictly generalizes the classic LQG setting, 1 which can be recovered by choosing ρ x0 = ρ wt = ρ vt = 0. The parameters ρ thus allow quantifying the uncertainty about the nominal model and building robustness to mis-specification. We emphasize that the Wasserstein ambiguity set W contains many non-Gaussian distributions and it is not readily obvious that the worst-case distribution in ( 4) is in fact Gaussian. However, the set W is also non-convex, as it contains only distributions under which the exogenous uncertainties are independent, which makes the distributionally robust LQG problem potentially difficult to solve.', '---', '> Our model strictly generalizes the classic LQG setting, 1 which is recovered when all Wasserstein radii ρ x0 , ρ wt , and ρ vt are set to 0. These parameters ρ thus quantify the level of uncertainty about the nominal model, enabling the construction of robust controllers against model misspecification. A key challenge is that the Wasserstein ambiguity set W encompasses many non-Gaussian distributions, making it non-trivial to ascertain if the worst-case distribution in (4) is Gaussian. Furthermore, the non-convex nature of W, due to the independence assumption of exogenous uncertainties, adds significant complexity to solving the distributionally robust LQG problem.', '405d404', '< ']
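The hunks above describe a Wasserstein ball around nominal zero-mean Gaussians, with radii ρ that recover the classic LQG setting when set to 0. The following is a minimal sketch of how the 2-Wasserstein distance between two zero-mean Gaussians can be computed from their covariances, and how a Gaussian candidate could be checked against such a ball. It rests on assumptions not visible in this diff: that each factor of the ambiguity set has the form W(P, P̂) ≤ ρ (the defining display is not among the changed lines), and the names `psd_sqrt`, `gaussian_w2`, and `in_ball` are hypothetical. The closed form used here holds for Gaussians only, so the check says nothing about the non-Gaussian members of the set.

```python
import numpy as np


def psd_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    S = (S + S.T) / 2.0                  # symmetrize against round-off
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, 0.0, None)      # drop tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2(S1, S2):
    """2-Wasserstein distance between N(0, S1) and N(0, S2).

    Uses the closed form
        W^2 = tr(S1) + tr(S2) - 2 tr((S2^{1/2} S1 S2^{1/2})^{1/2}).
    """
    root = psd_sqrt(S2)
    cross = psd_sqrt(root @ S1 @ root)
    sq = np.trace(S1) + np.trace(S2) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(sq, 0.0)))  # guard against negative round-off


def in_ball(S, S_hat, rho, tol=1e-6):
    """Assumed membership test: is N(0, S) within radius rho of N(0, S_hat)?"""
    return gaussian_w2(S, S_hat) <= rho + tol


# Illustrative nominal and candidate process-noise covariances (made-up values).
W_hat = np.eye(2)
W_cand = 4.0 * np.eye(2)
dist = gaussian_w2(W_cand, W_hat)
print(dist, in_ball(W_cand, W_hat, rho=1.0), in_ball(W_cand, W_hat, rho=1.5))
```

For commuting covariances the distance reduces to the Frobenius norm of the difference of the matrix square roots, so the example gives ‖2I − I‖_F = √2. With ρ = 0 only the nominal covariance passes the test, which matches the statement that zero radii recover the classic LQG model.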
