Abstract: Existing multi-view clustering methods have achieved remarkable success on general images but still face many limitations when clustering multimodal remote sensing images (RSIs). For example, these methods are sensitive to noise and spectral variability, ignore the diverse spatial structure information across modalities, or are computationally prohibitive for large-scale RSIs, which limits their applications. This
paper proposes a multi-scale spectral-spatial anchor graph
fusion (MSSAGF) method for multimodal remote sensing image
clustering. MSSAGF develops a superpixel-based nonlinear
neighborhood recovery strategy to reduce noise while enhancing
the spatial smoothness within multimodal remote sensing images.
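As an illustration of this idea, the sketch below smooths each modality by averaging spectra within SLIC superpixels; the segmentation parameters and the simple within-region mean are assumptions for exposition, not the paper's exact nonlinear recovery strategy.

```python
# Minimal sketch of superpixel-based neighborhood smoothing (illustrative;
# not MSSAGF's exact nonlinear recovery strategy).
import numpy as np
from skimage.segmentation import slic

def superpixel_smooth(image, n_segments=500, compactness=10.0):
    """Replace each pixel's spectrum with the mean spectrum of its superpixel.

    image: (H, W, B) array for one modality; returns an array of the same shape.
    """
    # SLIC groups spatially adjacent, spectrally similar pixels into superpixels.
    labels = slic(image, n_segments=n_segments, compactness=compactness,
                  channel_axis=-1, start_label=0)
    smoothed = np.empty_like(image, dtype=np.float64)
    for lab in np.unique(labels):
        mask = labels == lab
        # Averaging within a homogeneous region suppresses noise and spectral
        # variability while preserving region boundaries.
        smoothed[mask] = image[mask].mean(axis=0)
    return smoothed
```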
Using spatial-aware anchors to extract local spatial information for each modality, MSSAGF introduces multiscale local spectral-spatial anchor graphs that capture nonlinear correlations between pixels and their corresponding local regions. The small number of anchors substantially reduces graph construction and partitioning costs, making the time complexity of MSSAGF nearly linear and thus computationally feasible for
large-scale RSIs. Finally, MSSAGF develops an adaptive fusion
mechanism to fuse multiscale local anchor graphs into a unified
global anchor graph, integrating complementary information
across multiple modalities while directly obtaining the final
clustering results. Experimental results on three multimodal RSI datasets demonstrate the superiority of the proposed
method over state-of-the-art methods. Our code is publicly
available at https://github.com/W-Xinxin/MSSAGF.
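To make the anchor-graph idea concrete, here is a minimal sketch assuming k-means anchors, a Gaussian kernel, and a plain weighted average of per-modality graphs; these choices, and all names and parameters, are illustrative stand-ins, not MSSAGF's adaptive fusion mechanism.

```python
# Minimal sketch of anchor-graph clustering with multimodal fusion
# (a simplified reading of the abstract, not the authors' implementation).
import numpy as np
from sklearn.cluster import KMeans

def anchor_graph(X, m=200, k=5, sigma=1.0):
    """Build a pixel-anchor affinity matrix Z (n x m) in roughly O(n*m) time."""
    anchors = KMeans(n_clusters=m, n_init=4, random_state=0).fit(X).cluster_centers_
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)  # squared distances
    Z = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :k]                # k nearest anchors per pixel
    rows = np.arange(X.shape[0])[:, None]
    Z[rows, idx] = np.exp(-d2[rows, idx] / (2 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)            # row-normalize

def fuse_and_cluster(Zs, weights, n_clusters):
    """Fuse per-modality anchor graphs and cluster via the SVD of the fused Z."""
    Z = sum(w * Zi for w, Zi in zip(weights, Zs))      # simple convex combination
    # Spectral embedding: left singular vectors of the degree-normalized anchor
    # graph stand in for eigenvectors of the full n x n affinity graph.
    deg = Z.sum(axis=0)
    Zn = Z / np.sqrt(deg + 1e-12)
    U, _, _ = np.linalg.svd(Zn, full_matrices=False)
    emb = U[:, :n_clusters]
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(emb)
```

Because the SVD acts on an n x m matrix with m anchors rather than on an n x n graph, the overall cost stays close to linear in the number of pixels, which is the source of the near-linear complexity claimed above.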