Toward Understanding Latent Model Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control

Published: 19 Jun 2023, Last Modified: 24 Jul 2023, Venue: Frontiers4LCD
Keywords: Latent model learning, representation learning for control, linear quadratic Gaussian (LQG)
TL;DR: We show that cost-driven, direct latent model learning such as MuZero provably solves Linear Quadratic Gaussian control using a single trajectory.
Abstract: We study the problem of representation learning for control from partial and potentially high-dimensional observations. We approach this problem via direct latent model learning, where one directly learns a dynamical model in some latent state space by predicting costs. In particular, we establish finite-sample guarantees of finding a near-optimal representation function and a near-optimal controller using the directly learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. A part of our approach to latent model learning closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this work is to prove persistency of excitation for a new stochastic process that arises from our analysis of quadratic regression in our approach.
Submission Number: 124
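
The quadratic regression mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes i.i.d. Gaussian feature vectors `h` standing in for observation histories, a hypothetical ground-truth representation `M_true`, and costs of the form `||M_true h||^2 + b` plus small noise. All function names and dimensions are made up for illustration. Sampling `h` i.i.d. also sidesteps the persistency-of-excitation question that the abstract raises for single-trajectory data.

```python
import numpy as np

def quad_features(h):
    """Map each row h to the upper triangle of h h^T, plus a constant 1."""
    d = h.shape[1]
    iu = np.triu_indices(d)
    outer = h[:, :, None] * h[:, None, :]  # shape (n, d, d)
    # Off-diagonal entries appear twice in h^T N h, so double them.
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)
    feats = outer[:, iu[0], iu[1]] * scale
    return np.hstack([feats, np.ones((h.shape[0], 1))])

def fit_quadratic(h, c):
    """Least-squares fit of c ~ h^T N h + b with N symmetric."""
    d = h.shape[1]
    theta, *_ = np.linalg.lstsq(quad_features(h), c, rcond=None)
    iu = np.triu_indices(d)
    N = np.zeros((d, d))
    N[iu] = theta[:-1]
    N = N + N.T - np.diag(np.diag(N))  # mirror the upper triangle
    return N, theta[-1]

def representation_from(N, k):
    """Rank-k factor M with M^T M ~ N (identified only up to an orthogonal transform)."""
    w, V = np.linalg.eigh(N)
    top = np.argsort(w)[-k:]
    return (V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))).T

# Toy data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
d, k, n = 6, 2, 5000
M_true = rng.normal(size=(k, d))       # hypothetical true representation
h = rng.normal(size=(n, d))            # stand-in for observation histories
c = np.sum((h @ M_true.T) ** 2, axis=1) + 0.5 + 0.01 * rng.normal(size=n)

N_hat, b_hat = fit_quadratic(h, c)
M_hat = representation_from(N_hat, k)
```

Because the cost depends on `h` only through `M_true.T @ M_true`, the regression can recover the representation at best up to an orthogonal transform, which is why the sketch compares Gram matrices rather than `M_hat` and `M_true` directly.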