STMTNet: Spatio-Temporal Multiscale Triad Network for Cropland Change Detection in Remote Sensing Images
Abstract: Cropland change detection in remote sensing is hindered by spatial heterogeneity and temporal noise, leading to misaligned representations and erroneous change identification. We propose the spatio-temporal multiscale triad network (STMTNet), a FastSAM-VGGNet16 dual-stream framework. FastSAM, a lightweight segmentation backbone, efficiently extracts geometric topological features and ridge orientation details of cropland parcels. VGGNet16 captures macroscopic semantic features of crop-growth patterns, enabling complementary geometric–semantic analysis. The geometric-contextual aggregation module (GCAM) employs 3-D spatial parsing and dual-path pooling to mitigate blurred boundaries and texture discontinuities, significantly enhancing cropland boundary detection. The spatiotemporal adaptive gating module (TSAGM) uses dynamic cross-temporal feature weighting to resolve confusion between seasonal fluctuations and permanent land-use changes, suppressing temporal noise, such as cloud cover. The multiscale semantic enhancement module (MSEM) constructs a cross-granularity feature pyramid, integrating microscopic textures and macroscopic patterns. Comparative experiments on four remote sensing image change detection datasets, including three cropland datasets (CLCD, PX-CLCD, and GFSW-CLCD) and one building dataset (LEVIR-CD), against ten state-of-the-art methods demonstrate that STMTNet achieves superior performance, consistently leading in F1 scores and intersection over union values. Ablation studies validate the synergistic contributions of the GCAM, TSAGM, and MSEM modules. STMTNet enables high-precision cropland monitoring, supporting crop yield optimization and sustainable land management.
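For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1 and intersection over union (IoU) are commonly computed for the change class of a binary change map. It is not the authors' evaluation code. The function name `change_detection_scores`, the nested-list mask format, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration.

```python
def change_detection_scores(pred, truth):
    """Return (f1, iou) for the change class of two binary masks.

    pred and truth are equally sized nested lists of 0/1 values,
    where 1 marks a changed pixel (hypothetical input format).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # changed pixel correctly detected
            elif p and not t:
                fp += 1          # false alarm
            elif t and not p:
                fn += 1          # missed change

    f1_den = 2 * tp + fp + fn    # F1 = 2TP / (2TP + FP + FN)
    iou_den = tp + fp + fn       # IoU = TP / (TP + FP + FN)
    # Assumed convention: report 0.0 when no changed pixels exist at all.
    f1 = 2 * tp / f1_den if f1_den else 0.0
    iou = tp / iou_den if iou_den else 0.0
    return f1, iou


# Toy 2x3 example: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
f1, iou = change_detection_scores(pred, truth)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

Both metrics ignore true negatives, which matters in change detection because unchanged pixels usually dominate the image.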
External IDs: doi:10.1109/jstars.2025.3613578