Segment Anything Model Guided Semantic Knowledge Learning For Remote Sensing Change Detection

Published: 01 Jan 2024, Last Modified: 14 Nov 2024 · ICASSP 2024 · CC BY-SA 4.0
Abstract: Existing deep learning-based remote sensing change detection (RSCD) methods rely only on binary ground truth to guide network learning and neglect useful semantic guidance. As a result, the network can be easily misled by irrelevant category changes, leading to degraded performance and slow convergence. To this end, we propose a novel segment anything model (SAM) guided framework, termed SAM-CD, which mines the rich semantic knowledge in SAM for RSCD. Specifically, we first employ a transformer encoder to extract multi-scale global features from the bi-temporal images. Meanwhile, we obtain semantic prior masks from the bi-temporal images by providing SAM with category-relevant text prompts. Then, using the semantic prior masks as constraints, we design a masked attention module (MAM) that generates local features related to the categories of interest. Finally, the local and global features are fused and fed into a multi-layer perceptron (MLP) decoder to obtain the change map. The whole network is trained end-to-end and readily encodes the rich semantic knowledge of the changed targets to predict an accurate change map. Extensive experiments demonstrate that the proposed SAM-CD achieves state-of-the-art performance on a variety of benchmark datasets.
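To make the masked attention idea concrete, the sketch below shows one plausible way a SAM-derived semantic prior mask could constrain attention so that only category-relevant spatial positions contribute to the local features. This is a minimal, hypothetical PyTorch sketch assuming the prior is a binary per-pixel mask; the module name, fusion scheme, and decoder head are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MaskedAttentionSketch(nn.Module):
    """Hypothetical masked attention: queries attend only to spatial positions
    that a SAM-derived semantic prior mask marks as category-relevant."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor, prior_mask: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) features from the encoder
        # prior_mask: (B, 1, H, W) binary semantic prior mask in {0, 1}
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)                   # (B, H*W, C)
        # nn.MultiheadAttention ignores key positions where the mask is True,
        # so block attention to positions outside the semantic prior.
        key_padding_mask = prior_mask.flatten(2).squeeze(1) < 0.5   # (B, H*W)
        # Guard against a fully empty prior mask (would produce NaNs).
        all_masked = key_padding_mask.all(dim=1)
        key_padding_mask[all_masked] = False
        local, _ = self.attn(tokens, tokens, tokens,
                             key_padding_mask=key_padding_mask)
        local = self.norm(local + tokens)                           # residual + norm
        return local.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)                   # stand-in global features
    prior = (torch.rand(2, 1, 32, 32) > 0.5).float()     # stand-in SAM prior mask
    mam = MaskedAttentionSketch(dim=64)
    local = mam(feats, prior)
    fused = feats + local                                 # simple additive fusion (placeholder)
    logits = nn.Conv2d(64, 2, kernel_size=1)(fused)       # stand-in change-map head
    print(logits.shape)                                    # torch.Size([2, 2, 32, 32])
```

In this reading, the prior mask acts purely as an attention constraint rather than a hard feature mask, so gradients still flow through all spatial positions while the attended context is limited to the categories of interest.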