Modal disentangled generative adversarial networks for bidirectional magnetic resonance image synthesis

Published: 01 Jan 2025, Last Modified: 18 May 2025 · Eng. Appl. Artif. Intell. 2025 · CC BY-SA 4.0
Abstract: Magnetic resonance imaging (MRI) is commonly used in both clinical diagnosis and scientific research. Owing to high cost, time constraints, and the limited applicability of multi-contrast MRI in the presence of metallic implants, acquisition suffers from low throughput and often misses a specific modality. Cross-modal medical image synthesis based on Artificial Intelligence (AI) technologies has been proposed to generate the desired missing modal images. However, it still suffers from low expandability, invisible latent representations, and poor interpretability. We thus propose modal disentangled generative adversarial networks for bidirectional T1-weighted (T1-w) and T2-weighted (T2-w) medical image synthesis with controllable cross-modal synthesis and disentangled interpretability. Firstly, we construct a cross-modal synthesis model that achieves bidirectional generation between T1-w and T2-w MRI images and can easily be extended to adaptive modality synthesis without training multiple generators and discriminators. Then, we use an easily trained deep network to disentangle deep representations in latent space and map them into pixel space, visualizing morphological images and yielding multi-contrast MRI images with controllable feature generation. Besides, we construct an easy-to-interpret deep structure that incorporates morphology consistency to preserve edge contours and visualizes deep representations in latent space to enable interpretability, which is critical for artificial intelligence oriented to engineering applications and clinical diagnostics. Experiments demonstrate that our method outperforms recent state-of-the-art methods on benchmark datasets, with average improvements of 15.8% in structural similarity (SSIM), 12.7% in multiscale structural similarity (MSIM), 38.2% in peak signal-to-noise ratio (PSNR), and 5.2% in visual information fidelity (VIF).
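To make the single-generator, bidirectional design more concrete, the sketch below shows one common way such a model can be organized: a shared generator conditioned on a target-modality code (so T1-w→T2-w and T2-w→T1-w use the same weights), plus a simple edge-based term standing in for the morphology-consistency constraint. This is a minimal illustration under assumed names and shapes (ModalGenerator, n_modalities, the Sobel-based loss), not the authors' actual implementation.

```python
# Minimal PyTorch sketch: one modality-conditioned generator for bidirectional
# T1-w <-> T2-w synthesis, with a Sobel-edge stand-in for morphology consistency.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalGenerator(nn.Module):
    """Encode an input slice, then decode it toward a target-modality code."""
    def __init__(self, n_modalities=2, latent_ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, latent_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(latent_ch, latent_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # The target-modality code is broadcast and concatenated with the latent
        # map, so a single generator covers both synthesis directions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_ch + n_modalities, latent_ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(latent_ch, 1, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x, target_code):
        z = self.encoder(x)  # latent representation (available for disentanglement losses)
        code = target_code[:, :, None, None].expand(-1, -1, z.shape[2], z.shape[3])
        return self.decoder(torch.cat([z, code], dim=1)), z

def morphology_consistency(real, fake):
    """Edge-contour consistency via Sobel gradients (an illustrative stand-in
    for the paper's morphology-consistency constraint)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    def edges(img):
        return torch.sqrt(F.conv2d(img, kx, padding=1) ** 2 +
                          F.conv2d(img, ky, padding=1) ** 2 + 1e-8)
    return F.l1_loss(edges(fake), edges(real))

# Usage: synthesize T2-w from T1-w by passing a one-hot target-modality code.
g = ModalGenerator()
t1 = torch.randn(2, 1, 64, 64)                      # batch of T1-w slices
to_t2 = F.one_hot(torch.tensor([1, 1]), 2).float()  # target modality = T2-w
fake_t2, latent = g(t1, to_t2)
loss_morph = morphology_consistency(t1, fake_t2)
```

Conditioning a single generator on a modality code (rather than training one generator and discriminator per direction) is what allows the extension to additional contrasts mentioned in the abstract: adding a modality only enlarges the code, not the number of networks.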