Stochastic Latent Residual Video Prediction

Jean-Yves Franceschi; Edouard Delasalles; Mickael Chen; Sylvain Lamprier; Patrick Gallinari

25 Sept 2019 (modified: 22 Jun 2025) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Abstract: Video prediction is a challenging task: models have to account for the inherent uncertainty of the future. Most works in the literature are based on stochastic image-autoregressive recurrent networks, raising several performance and applicability issues. An alternative is to use fully latent temporal models which untie frame synthesis and dynamics. However, no such model for video prediction has been proposed in the literature yet, due to design and training difficulties. In this paper, we overcome these difficulties by introducing a novel stochastic temporal model. It is based on residual updates of a latent state, motivated by discretization schemes of differential equations. This first-order principle naturally models video dynamics as it allows our simpler, lightweight, interpretable, latent model to outperform prior state-of-the-art methods on challenging datasets.
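
The residual update mentioned in the abstract can be illustrated with a minimal sketch, assuming an explicit Euler-style step of the form y_{t+1} = y_t + dt * f(y_t, z_{t+1}). This is not the authors' implementation: the names `residual_rollout` and `toy_dynamics`, the step size `dt`, and the choice of drawing z from a standard normal are all assumptions made for illustration, and `toy_dynamics` merely stands in for a learned network.

```python
import math
import random


def residual_rollout(y0, f, steps, noise_dim, rng, dt=1.0):
    """Roll out y_{t+1} = y_t + dt * f(y_t, z_{t+1}) for a fixed number of steps.

    y0        : initial latent state (list of floats)
    f         : function (y, z) -> residual, same length as y
    noise_dim : size of the noise vector z sampled at every step
    rng       : random.Random instance, passed in for reproducibility
    """
    y = list(y0)
    states = [list(y0)]
    for _ in range(steps):
        # Assumption: z is standard normal noise; a learned prior could replace this.
        z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        delta = f(y, z)
        # Residual (first-order) update: add the increment to the previous state.
        y = [yi + dt * di for yi, di in zip(y, delta)]
        states.append(y)
    return states


def toy_dynamics(y, z):
    # Hypothetical stand-in for a learned residual network.
    return [math.tanh(-0.5 * yi + 0.1 * zi) for yi, zi in zip(y, z)]


rng = random.Random(0)
trajectory = residual_rollout([1.0, -1.0], toy_dynamics, steps=5, noise_dim=2, rng=rng)
```

In this sketch the latent trajectory is computed without touching any frame, which mirrors the abstract's point that a fully latent model unties dynamics from frame synthesis; a separate decoder (not shown) would map each latent state to an image.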

Code: https://sites.google.com/view/srvp/

Keywords: stochastic video prediction, variational autoencoder, residual dynamics

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/stochastic-latent-residual-video-prediction/code)
