Co-Enhancement of Multi-Modality Image Fusion and Object Detection via Feature Adaptation

Published: 01 Jan 2024. Last Modified: 13 May 2025. IEEE Trans. Circuits Syst. Video Technol. 2024. License: CC BY-SA 4.0
Abstract: The integration of multi-modality images significantly enhances the clarity of critical details for object detection, while semantic information from object detection can in turn enrich the image fusion process. However, although some semantic-driven fusion methods cater to specific application needs, this reciprocal relationship that could enhance the performance of both tasks remains largely unexplored and underutilized. To address this limitation, this study proposes a mutually reinforcing, dual-task-driven fusion architecture. Specifically, our design integrates a feature-adaptive interlinking module into both the image fusion and object detection components, effectively managing the inherent feature discrepancies between the two tasks. The core idea is to channel distinct features from both tasks into a unified feature space after feature transformation. We then design a feature-adaptive selection module to generate features that are rich in target semantic information and compatible with the fusion network. Finally, effective combination and mutual enhancement of the two tasks are achieved through an alternating training process. Extensive evaluations across various datasets corroborate the efficiency of our framework, demonstrating clear improvements in both fusion quality and detection accuracy.
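The abstract's pipeline can be sketched at a high level: task-specific features are projected into a unified space (the feature-adaptive interlinking idea), a selection step gates semantically informative features back toward the fusion branch, and the two tasks are trained in alternation. The sketch below is a minimal NumPy illustration under assumed shapes and module names (`FeatureAdaptiveInterlink`, `feature_adaptive_select`, `alternating_schedule` are all hypothetical placeholders, not the paper's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

class FeatureAdaptiveInterlink:
    """Hypothetical sketch: project fusion and detection features into one shared space."""
    def __init__(self, fusion_dim, det_dim, shared_dim):
        # Learned linear transforms in the real method; random here for illustration.
        self.W_fusion = rng.standard_normal((fusion_dim, shared_dim)) * 0.01
        self.W_det = rng.standard_normal((det_dim, shared_dim)) * 0.01

    def __call__(self, fusion_feat, det_feat):
        # Map both tasks' features into the unified feature space.
        return fusion_feat @ self.W_fusion, det_feat @ self.W_det

def feature_adaptive_select(z_fusion, z_det):
    """Illustrative gating: blend detection semantics into fusion-compatible features."""
    gate = 1.0 / (1.0 + np.exp(-(z_fusion * z_det)))  # element-wise sigmoid gate
    return gate * z_det + (1.0 - gate) * z_fusion

def alternating_schedule(num_epochs):
    """Skeleton of the alternating training process: switch task each epoch."""
    return ["fusion" if e % 2 == 0 else "detection" for e in range(num_epochs)]

# Example: batch of 8 feature vectors per task, projected to a 16-d shared space.
interlink = FeatureAdaptiveInterlink(fusion_dim=32, det_dim=24, shared_dim=16)
z_f, z_d = interlink(rng.standard_normal((8, 32)), rng.standard_normal((8, 24)))
selected = feature_adaptive_select(z_f, z_d)
```

Note that the gating choice (a sigmoid blend) is one plausible way to read "feature-adaptive selection"; the paper's actual module may differ.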