Bipartite Spatial Transformer Difference Network for Unsupervised Change Detection in Misaligned Images

Ling Hu, Qichao Liu, Jia Liu, Zhihui Wei, Liang Xiao

Published: 01 Jan 2025, Last Modified: 15 Nov 2025, IEEE Transactions on Geoscience and Remote Sensing, CC BY-SA 4.0
Abstract: Unsupervised change detection (CD) algorithms typically identify changed areas by comparing pixels or ground objects in co-registered bi-temporal images. However, due to small viewpoint variations in misaligned images and unknown changed regions between bi-temporal images, image registration becomes extremely challenging, and misaligned pixels may lead to a large number of false alarms. To address this issue, we develop a bidirectional spatial transformer network (BidSTN) capable of identifying the spatial shifts of corresponding key points in pre-event and post-event images. It allows us to compute the coordinate offset relationships between misaligned images with small viewpoint variations and achieve spatial interpolation and registration using a thin-plate spline function. Building on the BidSTN, the bipartite spatial transformer difference network (BSTDN) is constructed by integrating encoders and decoders with feature difference minimization and cross-reconstruction constraints. It concurrently learns the distribution-consistent features of bi-temporal images in the feature space and their coordinate offset relationships, enabling the spatial registration and CD of heterogeneous images within a single learning process. Furthermore, we incorporate a series of iterative refining modules (RMs) based on the aligned images and deep features. These modules enhance the accuracy of CD by learning real changes within two distinct feature spaces and continuously refining the change map. Finally, by cascading the BSTDN and RMs, an unsupervised cross-modal registration-detection joint framework is constructed. Extensive experiments on both real and simulated datasets validate the framework's superiority in suppressing misalignment artifacts, with $F_1$ scores improving by an average of 15.83% (real dataset) and 11.12% (simulated dataset) compared to the best baseline methods. The code is available at https://github.com/lh-rs/BSTDN.
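
The abstract describes warping misaligned bi-temporal images in both directions before differencing. Below is a minimal sketch of that bidirectional warping idea, assuming PyTorch; it is not the authors' implementation. For brevity it predicts a dense coordinate-offset field and warps with `grid_sample` instead of the paper's thin-plate spline interpolation, and all module and variable names (`OffsetHead`, `BidirectionalSTN`, etc.) are hypothetical.

```python
# Sketch: bidirectional spatial-transformer warping for a misaligned image pair.
# Assumption: dense offset fields stand in for the thin-plate spline registration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class OffsetHead(nn.Module):
    """Small CNN that regresses a per-pixel (dx, dy) offset field from the image pair."""

    def __init__(self, in_ch: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch * 2, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, pair):
        # Keep predicted shifts small, matching the small-viewpoint-variation setting.
        return torch.tanh(self.net(pair)) * 0.1


class BidirectionalSTN(nn.Module):
    """Predicts forward (t1 -> t2) and backward (t2 -> t1) offsets and warps both images."""

    def __init__(self, in_ch: int = 3):
        super().__init__()
        self.fwd = OffsetHead(in_ch)
        self.bwd = OffsetHead(in_ch)

    @staticmethod
    def _warp(img, offset):
        n = img.shape[0]
        # Identity sampling grid in normalized [-1, 1] coordinates.
        theta = torch.eye(2, 3, device=img.device).unsqueeze(0).repeat(n, 1, 1)
        grid = F.affine_grid(theta, img.shape, align_corners=False)
        grid = grid + offset.permute(0, 2, 3, 1)  # add predicted (dx, dy) per pixel
        return F.grid_sample(img, grid, align_corners=False)

    def forward(self, img_t1, img_t2):
        pair = torch.cat([img_t1, img_t2], dim=1)
        t1_aligned = self._warp(img_t1, self.fwd(pair))  # t1 warped towards t2
        t2_aligned = self._warp(img_t2, self.bwd(pair))  # t2 warped towards t1
        return t1_aligned, t2_aligned


if __name__ == "__main__":
    stn = BidirectionalSTN(in_ch=3)
    t1, t2 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    a1, a2 = stn(t1, t2)
    print(a1.shape, a2.shape)  # both torch.Size([1, 3, 64, 64])
```

In the paper's framework, the aligned pair would then feed the encoder-decoder branches trained with feature difference minimization and cross-reconstruction constraints; the sketch above only illustrates the registration step.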