Effectiveness Guided Cross-Modal Information Sharing for Aligned RGB-T Object Detection

Published: 01 Jan 2022, Last Modified: 17 Apr 2025 · IEEE Signal Process. Lett. 2022 · CC BY-SA 4.0
Abstract: Integrating multi-modal data can significantly improve detection performance in complex scenes by introducing additional information about the targets. However, most existing multi-modal detectors extract features from each modality separately, without considering the correlation between modalities. For aligned multi-modal data, the modalities are spatially correlated, and we exploit this correlation to share target information across modalities, thereby enhancing the feature representation of the targets. To this end, in this letter, we propose an Effectiveness Guided Cross-Modal Information Sharing Network (ECISNet) for aligned multi-modal data, which can still accurately detect objects when one modality fails. Specifically, a Cross-Modal Information Sharing (CIS) module is proposed to enhance feature extraction by sharing information about the targets across modalities. Then, since a failed modality may interfere with the other modalities during information sharing, we design a Modal Effectiveness Guiding (MEG) module that guides the CIS module to exclude the interference of failed modalities. Extensive experiments on three recent multi-modal detection datasets demonstrate that ECISNet outperforms relevant state-of-the-art detectors.
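The abstract describes the architecture only at a high level; the exact CIS and MEG designs are specified in the letter itself. The PyTorch sketch below illustrates the general idea under stated assumptions: each modality predicts a scalar effectiveness weight from its own features (a stand-in for MEG), and that weight scales the features it contributes to the other modality (a stand-in for CIS), so a failed stream is suppressed rather than propagated. All class names, the sigmoid gating, and the convolutional fusion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ModalEffectivenessGate(nn.Module):
    """Hypothetical stand-in for the MEG module: predicts a scalar
    effectiveness weight for a modality from its own feature map."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),          # global context per modality
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),                     # weight in (0, 1); near 0 for a failed modality
        )

    def forward(self, feat):
        return self.score(feat)               # shape (B, 1, 1, 1)

class CrossModalSharing(nn.Module):
    """Hypothetical stand-in for the CIS module: each modality's features
    are augmented with the other modality's features, scaled by that
    modality's predicted effectiveness weight."""
    def __init__(self, channels):
        super().__init__()
        self.gate_rgb = ModalEffectivenessGate(channels)
        self.gate_t = ModalEffectivenessGate(channels)
        self.fuse_rgb = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.fuse_t = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_rgb, feat_t):
        w_rgb = self.gate_rgb(feat_rgb)       # effectiveness of the RGB stream
        w_t = self.gate_t(feat_t)             # effectiveness of the thermal stream
        # Spatial alignment of the modalities permits direct element-wise sharing.
        out_rgb = feat_rgb + self.fuse_rgb(w_t * feat_t)
        out_t = feat_t + self.fuse_t(w_rgb * feat_rgb)
        return out_rgb, out_t

# Usage with aligned RGB/thermal backbone features of matching shape.
rgb = torch.randn(2, 256, 32, 32)
thermal = torch.randn(2, 256, 32, 32)
module = CrossModalSharing(channels=256)
fused_rgb, fused_t = module(rgb, thermal)
print(fused_rgb.shape, fused_t.shape)         # torch.Size([2, 256, 32, 32]) each
```

In this sketch the enhanced features would feed a standard detection head; the gating means that when, say, the thermal stream fails, its near-zero weight keeps it from corrupting the RGB features during sharing.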