Abstract: Highlights
• A hierarchical spatiotemporal feature interaction network is designed for video saliency prediction.
• A multi-scale feature integration unit is proposed to enable interaction among spatiotemporal features.
• Hierarchical enhancement modules (TFE, CFE) are proposed to strengthen the fused features at different levels.