Collaborative spatial-temporal video salient object detection with cross attention transformer

Published: 01 Jan 2024, Last Modified: 29 Sept 2024, Signal Process. 2024, CC BY-SA 4.0
Abstract: Highlights
• A Siamese feature extractor is proposed to jointly extract static and motion features.
• A deep level set method is utilized to fix the semantic gap.
• A cross-attention transformer is proposed to refine and fuse static and motion features.
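The highlights name cross attention as the mechanism for fusing static and motion features, but this page does not describe the actual architecture. As a rough illustration only, the following is a minimal sketch of generic single-head scaled dot-product cross attention, in which static tokens act as queries and motion tokens act as keys and values. The function names (`softmax`, `cross_attention`, `fuse`), the absence of learned projections, and the residual addition are all assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional vectors (here: static tokens)
    keys, values: lists of equal length (here: motion tokens)
    Learned projection matrices are omitted (assumption for brevity).
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def fuse(static_tokens, motion_tokens):
    """Hypothetical fusion: static tokens attend to motion tokens,
    and the attended result is added back as a residual."""
    attended = cross_attention(static_tokens, motion_tokens, motion_tokens)
    return [
        [s + a for s, a in zip(s_row, a_row)]
        for s_row, a_row in zip(static_tokens, attended)
    ]


if __name__ == "__main__":
    static = [[0.0, 0.0], [1.0, 0.0]]
    motion = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for row in fuse(static, motion):
        print(row)
```

The fused output has one vector per static token, so the spatial layout of the static branch is preserved while motion information is mixed in. A real model would typically add learned query, key, and value projections, multiple heads, and normalization layers.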