Stochastic Video Prediction with Perceptual Loss

Donghun LEE; Ingook Jang; SEONGHYUN KIM; Chanwon Park; Junhee Park

Published: 08 Dec 2021, Last Modified: 05 May 2023 | DGMs and Applications @ NeurIPS 2021 Poster | Readers: Everyone

Keywords: Video Prediction, Variational autoencoder, Perceptual loss

TL;DR: We propose stochastic video generation with perceptual loss (SVG-PL) to better handle uncertainty and reduce blurred areas in future prediction.

Abstract: Predicting future states is a challenging task in decision-making systems because the future is inherently uncertain. Most work in this area is based on deep generative networks such as variational autoencoders, which use pixel-wise reconstruction in their loss functions. Predicting the future with pixel-wise reconstruction alone can fail to capture the full distribution of high-level representations and result in inaccurate, blurred predictions. In this paper, we propose stochastic video generation with perceptual loss (SVG-PL) to better handle uncertainty and reduce blurred areas in future prediction. The proposed model combines a perceptual loss function with a pixel-wise loss function for image reconstruction and future state prediction. The model is built on a variational autoencoder, which reduces the high-dimensional input to latent variables that capture both the spatial information and the temporal dynamics needed for future prediction. We show that using perceptual loss for video prediction improves reconstruction ability and results in clearer predictions. Improvements in video prediction could further help the decision-making process in multiple downstream applications.
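
The combination of a pixel-wise term and a perceptual term described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the feature network, the loss weights, or the prior. The toy feature extractor `toy_features`, the weights `lam` and `beta`, and the fixed standard-normal prior in the KL term are all assumptions made here for illustration.

```python
import numpy as np

def toy_features(frame, kernel):
    """Hypothetical stand-in for a feature extractor: one 'valid' 2-D
    correlation followed by ReLU. A real perceptual loss would compare
    activations of a pretrained network instead (assumption)."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, exp(logvar)) || N(0, I) ), summed over latent dimensions.
    # A fixed standard-normal prior is assumed here for simplicity.
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

def combined_loss(pred, target, mu, logvar, kernel, lam=1.0, beta=1e-4):
    # Pixel-wise term: mean squared error in image space.
    pixel = np.mean((pred - target) ** 2)
    # Perceptual term: mean squared error in (toy) feature space.
    perceptual = np.mean(
        (toy_features(pred, kernel) - toy_features(target, kernel)) ** 2
    )
    # VAE regularizer on the latent posterior.
    kl = kl_to_standard_normal(mu, logvar)
    # lam and beta are hypothetical weights, not values from the paper.
    return pixel + lam * perceptual + beta * kl
```

In this sketch the perceptual term penalizes differences in local structure (here, responses to a single filter) in addition to the per-pixel differences measured by the pixel-wise term.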
