SD-Fuse: An Image Structure-Driven Model for Multi-Focus Image Fusion

Zeyu Wang

Published: 17 Dec 2025, Last Modified: 27 Jan 2026 | Venue: Information Fusion | Readers: Everyone | License: CC BY 4.0

Abstract: Multi-focus image fusion (MFIF) aims to generate a fully focused composite from multiple partially focused images. Existing methods often rely on complex loss functions or customized network architectures to refine decision-map boundaries, overlooking the intrinsic structural information of the images. In this study, we empirically uncover an image structure-boundary prior through comprehensive statistical analysis, demonstrating that boundaries between focused and defocused regions naturally align with prominent structural features of images. Motivated by this prior, we propose a structure-driven fusion framework termed SD-Fuse. The framework consists of three complementary components: a global structure-aware branch, a local focus detection branch, and a novel structure-guided filter (SGF). The structure-aware branch first extracts essential structural cues and then employs a Transformer module to capture global structural dependencies. Concurrently, the focus detection branch uses a CNN architecture to generate initial decision maps from spatial inputs. Crucially, the SGF, inspired by traditional guided filtering methods, facilitates effective interaction between global and local features. Through optimization within the SGF, the refined global structure provided by the Transformer progressively guides the local spatial features, ensuring precisely aligned boundaries and artifact-free decision maps. Extensive qualitative and quantitative experiments demonstrate that SD-Fuse significantly outperforms existing methods, achieving state-of-the-art performance. The code is available at https://github.com/zhaolb4080/SD-Fuse.
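
For readers unfamiliar with the traditional guided filtering that the abstract cites as the inspiration for SGF, the following is a minimal sketch of the classical (non-learned) guided filter of He et al. It is not the SD-Fuse module: the function names `box_mean` and `guided_filter`, the window radius `r`, and the regularizer `eps` are illustrative assumptions, and the mapping of "guide" to a structure map and "src" to an initial decision map is only an analogy to the roles described above.

```python
import numpy as np


def box_mean(x, r):
    """Mean of x over a (2r+1) x (2r+1) window, using edge-replicated padding."""
    k = 2 * r + 1
    h, w = x.shape
    p = np.pad(x.astype(np.float64), r, mode="edge")
    # Integral image with a leading row and column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1), dtype=np.float64)
    s[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    # Window sum for every output pixel from four integral-image corners.
    total = s[k:k + h, k:k + w] - s[:h, k:k + w] - s[k:k + h, :w] + s[:h, :w]
    return total / (k * k)


def guided_filter(guide, src, r=2, eps=1e-3):
    """Classical guided filter: smooth `src` while following edges of `guide`.

    guide : 2-D array, e.g. a structure map (assumed role, by analogy)
    src   : 2-D array, e.g. a noisy initial decision map (assumed role)
    r     : window radius
    eps   : regularizer; larger values give stronger smoothing
    """
    guide = guide.astype(np.float64)
    src = src.astype(np.float64)

    mean_g = box_mean(guide, r)
    mean_s = box_mean(src, r)
    var_g = box_mean(guide * guide, r) - mean_g * mean_g
    cov_gs = box_mean(guide * src, r) - mean_g * mean_s

    # Per-window linear model: src ~ a * guide + b.
    a = cov_gs / (var_g + eps)
    b = mean_s - a * mean_g

    # Average the coefficients of all windows covering each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)
```

In this sketch, calling `guided_filter(structure_map, decision_map)` returns a map that is smoothed in flat regions of the guide and keeps transitions where the guide has edges, which is the edge-alignment behavior the abstract attributes to structure guidance. How SGF realizes this within the network is described in the paper and repository, not here.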