Abstract: Multi-modal medical image fusion (MMIF) merges multiple modalities of medical images to obtain better imaging quality and more comprehensive information, thereby enhancing the reliability of clinical diagnosis. Because different types of medical images rely on different imaging mechanisms and highlight different pathological tissues, accurately fusing the information from various medical images remains a major challenge in image fusion research. In this paper, we propose a self-supervised subspace attentional framework for multi-modal image fusion, which consists of two sub-networks: a feature extraction network and a feature fusion network. We adopt a self-supervised strategy that enables the framework to adaptively extract features from the source images through reconstruction of the fused image. Specifically, we first employ a subspace attentional Siamese Weighted Auto-Encoder as the feature extractor to capture both local and global features of the source images. The extracted features are then fed into a weighted fusion decoding network to reconstruct the fused result, with shallow features from the extractor assisting the reconstruction. Finally, by training the two sub-networks jointly, the feature extractor adaptively learns to extract the features best suited to the fused result. Furthermore, to achieve better fusion results, we design a novel weight estimation scheme for the weighted fidelity loss, which measures the importance of each pixel by combining salient features and local contrast features of the image. Experiments demonstrate that our method outperforms other state-of-the-art fusion approaches.
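As a rough illustration of the pipeline described in the abstract, the sketch below outlines one way the two sub-networks and the pixel-wise weight map for the weighted fidelity loss could be organized in PyTorch. All module names, channel widths, window sizes, and the specific saliency and local-contrast operators are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the two-sub-network fusion framework described above.
# Module names, layer sizes, and the saliency / local-contrast cues are assumptions
# for illustration only, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubspaceAttention(nn.Module):
    """Toy channel-subspace attention: project features to a subspace, gate, project back."""
    def __init__(self, channels, subspace=16):
        super().__init__()
        self.down = nn.Conv2d(channels, subspace, 1)
        self.up = nn.Conv2d(subspace, channels, 1)

    def forward(self, x):
        gate = torch.sigmoid(self.up(F.relu(self.down(x))))
        return x * gate


class SiameseEncoder(nn.Module):
    """Shared-weight encoder applied to each source modality (Siamese branches)."""
    def __init__(self, in_ch=1, feat_ch=64):
        super().__init__()
        self.shallow = nn.Sequential(nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.deep = nn.Sequential(nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
                                  SubspaceAttention(feat_ch))

    def forward(self, x):
        shallow = self.shallow(x)   # shallow features reused by the decoder
        deep = self.deep(shallow)   # attention-refined deeper features
        return shallow, deep


class FusionDecoder(nn.Module):
    """Weighted fusion decoding network: merge features and reconstruct the fused image."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.merge = nn.Conv2d(2 * feat_ch, feat_ch, 1)
        self.decode = nn.Sequential(nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
                                    nn.Conv2d(feat_ch, 1, 3, padding=1))

    def forward(self, deep_a, deep_b, shallow_a, shallow_b):
        fused = self.merge(torch.cat([deep_a, deep_b], dim=1))
        skip = 0.5 * (shallow_a + shallow_b)   # shallow features assist reconstruction
        return torch.sigmoid(self.decode(torch.cat([fused, skip], dim=1)))


def pixel_weight(img, win=9, eps=1e-6):
    """Illustrative per-pixel importance: a mix of a saliency cue (deviation from the
    global mean) and a local-contrast cue (local standard deviation)."""
    saliency = (img - img.mean(dim=(2, 3), keepdim=True)).abs()
    mu = F.avg_pool2d(img, win, stride=1, padding=win // 2)
    var = F.avg_pool2d(img ** 2, win, stride=1, padding=win // 2) - mu ** 2
    contrast = var.clamp_min(0).sqrt()
    w = saliency + contrast
    return w / (w.amax(dim=(2, 3), keepdim=True) + eps)


def weighted_fidelity_loss(fused, src_a, src_b):
    """Weighted fidelity term: each source contributes according to its pixel weights."""
    w_a, w_b = pixel_weight(src_a), pixel_weight(src_b)
    return (w_a * (fused - src_a) ** 2 + w_b * (fused - src_b) ** 2).mean()
```

In such a setup, the encoder and decoder would be trained jointly on this loss (plus any regularization terms the paper may use), so that the extractor adapts its features to the quality of the reconstructed fused image.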
External IDs: dblp:journals/air/ZhangNCMW23