SCAN: Salience-Guided Cross-Domain Aggregation Network for Multimodal Medical Image Fusion

Published: 2025 · Last Modified: 07 Nov 2025 · IEEE Trans. Instrum. Meas. 2025 · CC BY-SA 4.0
Abstract: Existing multimodal medical image fusion methods often rely on convolutional neural networks (CNNs) for local feature extraction but fail to model global relationships effectively. Transformer-based approaches address this limitation but are computationally expensive. In this work, we propose the salience-guided cross-domain aggregation network (SCAN) for efficient, high-performance multimodal medical image fusion. SCAN combines the strengths of CNNs and Transformers: a novel nested pyramid residual attention (NPRA) module in the encoder improves local feature extraction and adaptively attends to important regions, while a salience-guided dual attention (SGDA) module in the decoder enhances fused features and preserves fine details. Extensive experiments on three multimodal brain datasets show that SCAN outperforms state-of-the-art methods in both qualitative and quantitative evaluation. Future work will explore the scalability of SCAN to other medical imaging tasks and its potential for real-time applications.
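The abstract describes the SGDA decoder module only at a high level. As a rough illustration of the general idea of salience-guided dual (spatial and channel) attention, the sketch below weights a fused feature map by a per-pixel salience map and a per-channel importance vector, with a residual connection to preserve detail. All names, the salience definitions, and the weighting scheme here are assumptions for illustration, not the paper's actual SGDA design.

```python
import numpy as np

def salience_guided_dual_attention(feat, eps=1e-8):
    """Hypothetical sketch of salience-guided dual attention.

    feat: (C, H, W) fused feature map.
    Spatial attention comes from a per-pixel salience map (channel-wise
    L2 norm); channel attention from a softmax over per-channel means.
    A residual connection keeps the original features intact.
    """
    C, H, W = feat.shape
    # Spatial salience: per-pixel energy across channels, scaled to [0, 1]
    spatial = np.linalg.norm(feat, axis=0)
    spatial = (spatial - spatial.min()) / (spatial.max() - spatial.min() + eps)
    # Channel salience: softmax over mean activation per channel
    chan = feat.mean(axis=(1, 2))
    chan = np.exp(chan - chan.max())
    chan /= chan.sum()
    # Apply both attentions (scaled by C so channel weights average to 1),
    # then add the residual to preserve fine details
    attended = feat * spatial[None, :, :] * (C * chan)[:, None, None]
    return feat + attended

feat = np.random.default_rng(0).normal(size=(4, 8, 8))
out = salience_guided_dual_attention(feat)
print(out.shape)  # (4, 8, 8)
```

In this toy version, regions with low salience are passed through nearly unchanged (via the residual), while high-salience regions are amplified along both dimensions; the real SGDA module presumably learns these weights rather than computing them from fixed statistics.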