STA3D: Spatiotemporally attentive 3D network for video saliency prediction

Wenbin Zou, Shengkai Zhuo, Yi Tang, Shishun Tian, Xia Li, Chen Xu

Published: 01 Jan 2021, Last Modified: 08 Mar 2025
Venue: Pattern Recognit. Lett. 2021
License: CC BY-SA 4.0

Abstract: Highlights
• Attention guiding is significant for video saliency prediction based on 3D CNN.
• A spatiotemporally attentive 3D CNN for robust video saliency prediction is proposed.
• An adaptive upsampling module for refining spatial features is proposed.
• A frame-wise attention module for propagating temporal features is proposed.
• The effectiveness of the proposed method is comprehensively evaluated.