The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels

Michael Arthur Leopold Pearce

16 Oct 2019 (modified: 05 May 2023), AABI 2019, Readers: Everyone

Keywords: Gaussian Processes, Amortised Inference

TL;DR: We learn sophisticated trajectories of an object purely from pixels on a toy video dataset by using a VAE structure with a Gaussian process prior.

Abstract: We consider the problem of unsupervised learning of a low-dimensional, interpretable latent state of a video containing a moving object. The problem of distilling dynamics from pixels has been extensively considered through the lens of graphical/state space models that exploit Markov structure for cheap computation and structured graphical model priors for enforcing interpretability on latent representations. We take a step towards extending these approaches by discarding the Markov structure and instead repurposing the recently proposed Gaussian Process Prior Variational Autoencoder to learn sophisticated latent trajectories. We describe the model and perform experiments on a synthetic dataset, where the model reliably reconstructs smooth dynamics exhibiting U-turns and loops. We also observe that this model may be trained without any beta-annealing or freeze-thaw of training parameters: training is performed purely end-to-end on the unmodified evidence lower bound objective. This is in contrast to previous works, albeit for slightly different use cases, where application-specific training tricks are often required.
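
To illustrate what a non-Markov Gaussian process prior over a latent trajectory can look like, here is a minimal sketch rather than the authors' implementation. The squared-exponential (RBF) kernel, the lengthscale, the jitter term, the two-dimensional latent space and all function names are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np

def rbf_kernel(times, lengthscale=2.0, variance=1.0):
    """Squared-exponential kernel over frame times (an assumed kernel choice)."""
    diffs = times[:, None] - times[None, :]
    return variance * np.exp(-0.5 * (diffs / lengthscale) ** 2)

def sample_latent_trajectory(num_frames=30, latent_dim=2, seed=0):
    """Draw one latent trajectory, with each latent dimension an independent GP over time."""
    rng = np.random.default_rng(seed)
    times = np.arange(num_frames, dtype=float)
    # Small diagonal jitter keeps the covariance numerically positive definite.
    cov = rbf_kernel(times) + 1e-6 * np.eye(num_frames)
    chol = np.linalg.cholesky(cov)
    # Correlate white noise across all frames at once: no Markov factorisation is used.
    return chol @ rng.standard_normal((num_frames, latent_dim))

trajectory = sample_latent_trajectory()  # array of shape (num_frames, latent_dim)
```

Because the covariance couples every pair of frames, samples from such a prior are smooth curves that can turn back on themselves, which is the kind of trajectory (U-turns, loops) the abstract describes.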
