Video prediction by efficient transformers

Published: 2023, Last Modified: 29 Aug 2025, Image Vis. Comput. 2023, CC BY-SA 4.0
Abstract: Highlights
• A new efficient Transformer block for video feature learning is proposed by combining spatial local and temporal attention.
• A new family of video prediction Transformers is proposed, which reaches or outperforms complex SOTA ConvLSTM-based models.
• It is the first paper that conducts a formal comparison of three different attention-based video prediction variants.