Fusion2Void: Unsupervised Multi-Focus Image Fusion Based on Image Inpainting

Published: 01 Jan 2025 · Last Modified: 27 Jul 2025 · IEEE Trans. Circuits Syst. Video Technol., 2025 · CC BY-SA 4.0
Abstract: Multi-focus image fusion aims to integrate the clear regions of different partially focused images into a single ‘all-in-focus’ composite. Because no ground truth exists for multi-focus image fusion, supervised deep learning methods are ill-suited to the task. In this paper, we present an unsupervised approach, named Fusion2Void, that tackles the missing ground truth by framing image inpainting as an auxiliary task. Specifically, Fusion2Void uses a fusion network to merge the focused regions from multiple source images. After fusion, image patches are randomly dropped from the source images to construct an auxiliary image inpainting task, and an inpainting network then uses the fused image as a guide to restore the missing content. The dropped content includes both focused and defocused regions. Restoring focused patches is significantly harder than restoring defocused ones because they contain more high-frequency detail; once the focused patches can be restored, repairing the defocused patches becomes notably easier. The inpainting network therefore implicitly compels the fused image to incorporate all focused content from the source images, since only then can the missing focused regions be restored faithfully. Through this inpainting objective, the fusion network learns to generate ‘all-in-focus’ images in an unsupervised manner. Experiments on several synthetic and real-world datasets demonstrate Fusion2Void’s state-of-the-art performance relative to existing methods.
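To make the training scheme described in the abstract concrete, the following is a minimal PyTorch sketch of one training step: fuse the sources, randomly drop patches from them, and restore the dropped content guided by the fused image. All module architectures, the patch-masking scheme, channel counts, and the masked reconstruction loss are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a Fusion2Void-style training step (assumed details throughout).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Stand-in conv net; hypothetical, not the paper's architecture."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )
    def forward(self, x):
        return self.body(x)

def random_patch_mask(b, h, w, patch=16, drop_prob=0.3, device="cpu"):
    """Drop square patches at random: 0 = dropped, 1 = kept (assumed scheme)."""
    gh, gw = h // patch, w // patch
    keep = (torch.rand(b, 1, gh, gw, device=device) > drop_prob).float()
    return F.interpolate(keep, size=(h, w), mode="nearest")

fusion_net  = TinyNet(in_ch=6, out_ch=3)   # two RGB sources -> one fused image
inpaint_net = TinyNet(in_ch=9, out_ch=6)   # masked sources + fused guide -> restored sources
opt = torch.optim.Adam(
    list(fusion_net.parameters()) + list(inpaint_net.parameters()), lr=1e-4)

src_a = torch.rand(4, 3, 128, 128)         # partially focused source 1 (dummy batch)
src_b = torch.rand(4, 3, 128, 128)         # partially focused source 2 (dummy batch)

# Step 1: fuse the focused regions of the two sources.
fused = fusion_net(torch.cat([src_a, src_b], dim=1))

# Step 2: randomly drop patches from the sources to build the inpainting task.
mask = random_patch_mask(4, 128, 128)
masked = torch.cat([src_a * mask, src_b * mask], dim=1)

# Step 3: the inpainting network restores the missing content, guided by `fused`.
restored = inpaint_net(torch.cat([masked, fused], dim=1))

# Step 4: reconstruction loss on the dropped regions only. Gradients flow back
# into fusion_net through `fused`, pushing the fused image to retain all
# focused content so the dropped focused patches can be restored.
target = torch.cat([src_a, src_b], dim=1)
loss = (((restored - target) ** 2) * (1 - mask)).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

The key design point this sketch illustrates is that the inpainting loss is computed only on the dropped regions, so the fused image is useful as a guide exactly to the extent that it preserves the sources' focused, high-frequency content.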