Abstract: Contemporary works on multi-modal image fusion often rely heavily on aligned source images, which limits their practicality when the data are misaligned. However, there remains a significant gap in developing effective multi-modal image registration methods to address this problem. Moreover, existing multi-modal image registration models are largely restricted to specific types of multi-modal data and lack a general model applicable to diverse multi-modal data types. To address the above issues, this study introduces a novel method named PGMR, the first plug-and-play general multi-modal image registration model. PGMR comprises three components: a Modality Prompt Module (MPM), a Universal Registration Framework (URF), and a Detail Enhancement Module (DEM). URF serves as the fundamental registration framework, handling both rigid and non-rigid deformations to achieve basic multi-modal image registration. MPM, a core component of this work, is embedded within URF. Leveraging prompt learning, MPM dynamically integrates modality prompts into the intermediate output of URF, not only alleviating modality discrepancies but also improving the registration model's generalization across various multi-modal data types. DEM is a detail enhancement module for multi-modal image registration that enriches the details of the registration results, thereby improving the effectiveness of subsequent tasks. We evaluate PGMR on four types of multi-modal data; extensive experiments validate its feasibility and demonstrate the superiority of our method over state-of-the-art alternatives. The code will be available at https://github.com/stwts/PGMR.
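To make the prompt-learning idea behind MPM concrete, the following is a minimal, hypothetical sketch (not the authors' released implementation) of how a learnable per-modality prompt could be fused with an intermediate feature map of a registration backbone; the class name, `prompt_dim`, and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ModalityPromptModule(nn.Module):
    """Illustrative sketch: condition shared registration features on a modality prompt.

    Each modality pair (e.g. visible-infrared, CT-MRI) owns a learnable prompt
    vector that is projected to the feature channel dimension and added onto
    the backbone's intermediate feature map.
    """

    def __init__(self, num_modalities: int, channels: int, prompt_dim: int = 64):
        super().__init__()
        # One learnable prompt vector per modality type (assumed design).
        self.prompts = nn.Embedding(num_modalities, prompt_dim)
        # Project the prompt to the channel dimension for broadcast addition.
        self.project = nn.Linear(prompt_dim, channels)

    def forward(self, feat: torch.Tensor, modality_id: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) intermediate features from the registration network.
        # modality_id: (B,) integer index of each sample's modality pair.
        prompt = self.project(self.prompts(modality_id))  # (B, C)
        return feat + prompt[:, :, None, None]            # broadcast over H, W


if __name__ == "__main__":
    mpm = ModalityPromptModule(num_modalities=4, channels=32)
    feat = torch.randn(2, 32, 64, 64)
    ids = torch.tensor([0, 3])  # e.g. visible-infrared and CT-MRI samples
    print(mpm(feat, ids).shape)  # torch.Size([2, 32, 64, 64])
```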
External IDs: dblp:journals/tcsv/ZhengDZHR25