High Fidelity Video Prediction with Large Stochastic Recurrent Neural NetworksDownload PDF

Ruben Villegas, Arkanath Pathak, Harini Kannan, Honglak Lee, Dumitru Erhan, Quoc Le

06 Sept 2019 (modified: 05 May 2023)NeurIPS 2019Readers: Everyone
Abstract: Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time. Previously proposed solutions require complex network architectures and highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: maximizing the capacity of a neural network without such specialized layers. We perform the first large-scale empirical study of the effect of capacity on video prediction models. We also investigate the importance of recurrent connections and modeling stochasticity. We experimentally demonstrate our results on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling car driving.
CMT Num: 44
0 Replies

Loading