Structure-Aware Spatial-Temporal Interaction Network for Video Shadow Detection

Housheng Wei, Guanyu Xing, Jingwei Liao, Yanci Zhang, Yanli Liu

Published: 2024, Last Modified: 16 May 2025, IJCAI 2024, CC BY-SA 4.0

Abstract: Video shadow detection faces significant challenges due to ambiguous semantics and variable shapes. Existing video shadow detection algorithms typically overlook fine shadow details, resulting in inconsistent detection between consecutive frames in complex real-world video scenarios. To address this issue, we propose a spatial-temporal feature interaction strategy, which refines and enhances global shadow semantics with local prior features when modeling shadow relations between frames. Moreover, we propose a structure-aware shadow prediction module, which focuses on modeling the distance relation between local shadow edges and regions. Quantitative experimental results demonstrate that our approach significantly outperforms state-of-the-art methods, providing stable and consistent shadow detection results in complex video shadow scenarios.
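The abstract does not say how the distance relation between shadow edges and regions is represented. As a purely illustrative sketch, and not the authors' module, one common encoding is a signed distance map computed from a binary shadow mask: each pixel stores its distance to the nearest shadow edge, positive inside the shadow and negative outside. The function names `edge_pixels` and `signed_edge_distance`, the 4-neighbourhood edge definition, and the sign convention are all assumptions made for this example.

```python
from math import hypot


def edge_pixels(mask):
    """Return (row, col) of shadow pixels on the shadow boundary.

    Assumption: a shadow pixel is an edge pixel if any 4-neighbour is
    non-shadow or lies outside the image.
    """
    h, w = len(mask), len(mask[0])
    edges = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < h and 0 <= cc < w)
                if outside or not mask[rr][cc]:
                    edges.append((r, c))
                    break
    return edges


def signed_edge_distance(mask):
    """Brute-force signed Euclidean distance to the nearest edge pixel.

    Positive inside the shadow, negative outside, zero on edge pixels.
    Quadratic in image size, so only suitable for small illustrations.
    """
    edges = edge_pixels(mask)
    if not edges:
        raise ValueError("mask contains no shadow pixels")
    result = []
    for r, row in enumerate(mask):
        out_row = []
        for c, is_shadow in enumerate(row):
            d = min(hypot(r - er, c - ec) for er, ec in edges)
            out_row.append(d if is_shadow else -d)
        result.append(out_row)
    return result


# Hypothetical 5x5 mask with a 3x3 shadow in the centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist = signed_edge_distance(mask)
```

A map like this gives every region pixel an explicit relation to the nearest edge, which is one way a prediction head could be supervised on structure. In practice a library distance transform would replace the brute-force loop.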