Multi-granularity siamese transformer-based change detection in remote sensing imagery

Published: 01 Jan 2024, Last Modified: 15 May 2025. Eng. Appl. Artif. Intell. 2024. CC BY-SA 4.0
Abstract: In recent years, Convolutional Neural Networks (CNNs) have driven rapid progress in Change Detection (CD). However, owing to the intrinsic properties of convolution kernels, CNNs cannot effectively model long-distance dependencies. The emergence of the Vision Transformer (ViT) offers a new way to address this problem. Building on the ViT architecture, this paper proposes a novel Multi-Granularity remote sensing image Change Detection model (MGCDT). We cascade several Local–Global Siamese Transformers (LGSTs) as the backbone to extract discriminative local and global semantic features. To address the serious problem of false and missed detections at feature boundaries, we propose a plug-and-play High Frequency Enhancement Unit (HFE) that replaces the inflexible U-shaped structure and refines the detection boundaries. To handle the multi-scale nature of ground objects, we propose a Multi-Scale Fusion Attention Unit (MSFA), which integrates the flow of multi-scale information into the self-attention computation. Finally, we use a Deep Feature Guidance Unit (DFG) to refine the shallow, detailed feature information. Extensive experiments show that, by exploiting multi-granularity information, MGCDT outperforms existing change detection algorithms on four remote sensing image change detection datasets.