BusReF: Infrared-Visible Images Registration and Fusion Focus on Reconstructible Area Using One Set of Features
Abstract: In multi-modal imaging scenarios, the misalignment of images presents a persistent challenge. Conventional image fusion algorithms, aiming to enhance the performance of downstream vision tasks, presuppose strictly registered inputs to achieve satisfactory results. To relax this assumption, a common approach is to register the images first; however, existing multi-modal registration methods are often hindered by complex architectures and a heavy reliance on semantic information. This paper proposes BusReF, a unified framework that jointly addresses image registration and fusion, with a specific focus on the Infrared-Visible Image Registration and Fusion (IVRF) task. Within this framework, unaligned image pairs are processed through three sequential stages: coarse registration, fine registration, and fusion. We demonstrate that this integrated approach enables more robust and accurate IVRF. Key to our framework is a novel training and evaluation strategy that employs masks to mitigate the influence of non-reconstructible regions on the loss function, thereby significantly improving the model's accuracy and robustness. Furthermore, we introduce a gradient-aware fusion network designed to effectively preserve complementary information from both modalities. Comprehensive experiments demonstrate that BusReF achieves superior performance when compared against various state-of-the-art registration and fusion algorithms. Our code is available at https://github.com/Yukarizz/BusReF.
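The abstract's central idea of masking non-reconstructible regions out of the loss can be illustrated with a minimal sketch. The snippet below, a hypothetical illustration rather than the paper's actual formulation, shows a reconstruction loss (plain L1 here, chosen for simplicity) in which a binary mask zeroes the contribution of pixels that cannot be recovered from the other modality, so misalignment artifacts in those regions do not penalize the model:

```python
import numpy as np

def masked_l1_loss(pred, target, mask):
    """Mean absolute error over reconstructible pixels only.

    `mask` is 1 where a pixel is reconstructible (e.g. visible in both
    modalities after warping) and 0 elsewhere; masked-out pixels
    contribute nothing to the loss or to its normalization.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    valid = mask.sum()
    if valid == 0:
        # No reconstructible pixels: the loss is undefined, return 0.
        return 0.0
    return float((np.abs(pred - target) * mask).sum() / valid)
```

With a full mask this reduces to ordinary mean absolute error; shrinking the mask restricts both the error sum and its normalizer to the reconstructible area.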
DOI: 10.1145/3773769