DMM: Disparity-Guided Multispectral Mamba for Oriented Object Detection in Remote Sensing

Minghang Zhou, Tianyu Li, Chaofan Qiao, Dongyu Xie, Guoqing Wang, Ningjuan Ruan, Lin Mei, Yang Yang, Heng Tao Shen

Published: 01 Jan 2025 · Last Modified: 21 Nov 2025 · IEEE Transactions on Geoscience and Remote Sensing · License: CC BY-SA 4.0
Abstract: Multispectral oriented object detection faces challenges due to both intermodal and intramodal discrepancies. Recent studies often rely on transformer-based models to address these issues and achieve cross-modal fusion detection. However, the quadratic computational complexity of transformers limits their performance on remote sensing imagery. Inspired by the efficiency and lower complexity of Mamba in long-sequence tasks, we propose disparity-guided multispectral Mamba (DMM), a multispectral oriented object detection framework comprising a disparity-guided cross-modal fusion Mamba (DCFM) module, a multiscale target-aware attention (MTA) module, and a target-prior aware (TPA) auxiliary task. The DCFM module leverages disparity information between modalities to adaptively merge features from RGB and infrared (IR) images, mitigating intermodal conflicts. The MTA module enhances feature representation by focusing on relevant target regions within the RGB modality, addressing intramodal variations. The TPA auxiliary task utilizes single-modal labels to guide the optimization of the MTA module, ensuring it focuses on targets and their local context. Extensive experiments on the DroneVehicle and VEDAI datasets demonstrate the effectiveness of our method, which outperforms state-of-the-art methods while maintaining computational efficiency. Code will be available at https://github.com/Another-0/DMM
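To make the disparity-guided fusion idea concrete, the following is a minimal conceptual sketch in plain Python: the per-element disparity between RGB and IR features is turned into a gate that adaptively weights each modality's contribution. This is an illustration only, not the authors' DCFM implementation (which operates on Mamba state-space feature sequences); the function name and scalar-feature setting are hypothetical.

```python
import math

def _sigmoid(x: float) -> float:
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def disparity_guided_fusion(rgb: list[float], ir: list[float]) -> list[float]:
    """Illustrative sketch (not the paper's DCFM): fuse paired RGB and IR
    feature values using their disparity as an adaptive gate.

    For each position, the signed disparity (rgb - ir) is squashed to a
    weight in (0, 1); the fused value is a convex combination of the two
    modalities, so it always lies between them. Equal inputs pass through
    unchanged, since the gate is then exactly 0.5.
    """
    fused = []
    for r, i in zip(rgb, ir):
        gate = _sigmoid(r - i)              # disparity-derived weight
        fused.append(gate * r + (1.0 - gate) * i)
    return fused
```

For example, `disparity_guided_fusion([0.5], [0.5])` returns `[0.5]`, while for differing inputs the fused value is pulled toward the modality with the stronger response, which is the intuition behind letting intermodal disparity steer the fusion rather than averaging modalities uniformly.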