MSFMamba: Multiscale Feature Fusion State Space Model for Multisource Remote Sensing Image Classification

Feng Gao, Xuepeng Jin, Xiaowei Zhou, Junyu Dong, Qian Du

Published: 01 Jan 2025, Last Modified: 13 Jan 2026, IEEE Transactions on Geoscience and Remote Sensing, CC BY-SA 4.0
Abstract: In the field of multisource remote sensing image classification, remarkable progress has been made using convolutional neural networks (CNNs) and Transformers. While CNNs are constrained by their local receptive fields, Transformers mitigate this issue with their global attention mechanism. However, Transformers come with the tradeoff of higher computational complexity. Recently, Mamba-based methods built upon the state space model (SSM) have shown great potential for long-range dependency modeling with linear complexity, but they have rarely been explored for multisource remote sensing image classification tasks. To address this gap, we propose the Multi-Scale Feature Fusion Mamba (MSFMamba) network, a novel framework designed for the joint classification of hyperspectral image (HSI) and light detection and ranging (LiDAR)/synthetic aperture radar (SAR) data. The MSFMamba network is composed of three key components: the Multi-Scale Spatial Mamba (MSpa-Mamba) block, the Spectral Mamba (Spe-Mamba) block, and the Fusion Mamba (Fus-Mamba) block. The MSpa-Mamba block employs a multiscale strategy to reduce computational cost and alleviate feature redundancy across multiple scanning routes, ensuring efficient spatial feature modeling. The Spe-Mamba block focuses on spectral feature extraction, addressing the unique challenges of HSI data representation. Finally, the Fus-Mamba block bridges the heterogeneous gap between HSI and LiDAR/SAR data by extending the original Mamba architecture to accommodate dual inputs, enhancing cross-modal feature interactions and enabling seamless data fusion. Together, these components enable MSFMamba to effectively tackle the challenges of multisource data classification, delivering improved performance with optimized computational efficiency.
Comprehensive experiments on four real-world multisource remote sensing datasets (Berlin, Augsburg, Houston2018, and Houston2013) demonstrate that MSFMamba outperforms several state-of-the-art methods, achieving overall accuracies of 76.92%, 91.38%, 92.38%, and 92.86%, respectively. The source code of MSFMamba will be publicly available at https://github.com/oucailab/MSFMamba.