Keyframing the Future: Discovering Temporal Hierarchy with Keyframe-Inpainter Prediction

Karl Pertsch; Oleh Rybkin; Jingyun Yang; Konstantinos G. Derpanis; Kostas Daniilidis; Joseph J. Lim; Andrew Jaegle

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Keywords: representation learning, variational inference, video generation, temporal hierarchy

TL;DR: We propose a model that learns to discover informative frames in a future video sequence and represent the video via its keyframes.

Abstract: Reasoning flexibly and efficiently about temporal sequences requires abstract representations that compactly capture the important information in the sequence. One way of constructing such representations is to focus on the important events in a sequence. In this paper, we propose a model that learns both to discover such key events (or keyframes) and to represent the sequence in terms of them. We do so using a hierarchical Keyframe-Inpainter (KeyIn) model that first generates keyframes and their temporal placement and then inpaints the sequences between keyframes. We propose a fully differentiable formulation for efficiently learning the keyframe placement. We show that KeyIn finds informative keyframes in several datasets with diverse dynamics. When evaluated on a planning task, KeyIn outperforms other recent proposals for learning hierarchical representations.
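The abstract does not spell out the differentiable placement formulation, so the following is only an illustrative sketch of one way such a relaxation could work, not the authors' implementation. It assumes each keyframe predicts a categorical distribution over its temporal offset from the previous keyframe; the function names `placement_distributions` and `soft_keyframe_targets`, the array shapes, and the use of flat feature vectors in place of images are all hypothetical choices made for this example.

```python
import numpy as np


def placement_distributions(offset_probs):
    """Turn per-keyframe offset distributions into distributions over time steps.

    offset_probs: array of shape (N, J). Row n is a categorical distribution
        over the offset (1..J steps) between keyframe n-1 and keyframe n.
        This parameterization is an assumption made for this sketch.
    Returns an array of shape (N, N * J). Row n is the distribution over the
        absolute time step (1..N*J) at which keyframe n is placed.
    """
    n_keyframes, max_offset = offset_probs.shape
    horizon = n_keyframes * max_offset
    placements = np.zeros((n_keyframes, horizon))

    # Distribution over the position of the previous keyframe, indexed by
    # absolute time 0..horizon. The sequence starts at time 0 with certainty.
    prev = np.zeros(horizon + 1)
    prev[0] = 1.0

    for n in range(n_keyframes):
        # Offsets start at 1, so the kernel has zero mass at offset 0.
        kernel = np.concatenate(([0.0], offset_probs[n]))
        # The position of a sum of independent offsets is the convolution
        # of their distributions. No mass is lost by truncating at horizon.
        cur = np.convolve(prev, kernel)[: horizon + 1]
        placements[n] = cur[1:]
        prev = cur
    return placements


def soft_keyframe_targets(frames, placements):
    """Expected ground-truth frame under each keyframe's placement distribution.

    frames: array of shape (T, D), one flattened frame per time step 1..T.
    placements: array of shape (N, T) as returned above.
    Returns an array of shape (N, D). Every operation is differentiable in
    the placement probabilities, which is the point of the relaxation.
    """
    return placements @ frames
```

Because the targets are a weighted average of frames rather than a hard selection, a reconstruction loss on the keyframes can pass gradients back to the offset probabilities. In a trained model these probabilities would come from a network output (for example a softmax), which is omitted here.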
