Abstract: Accurate segmentation of multimodal medical images with missing modalities remains a critical challenge, since incomplete data are often encountered in clinical practice. The lack of modality-specific information often leads to significant performance degradation when modalities are severely missing. To address this problem, we focus on modeling the relationships between modality-specific features. We propose a joint representation learning framework, named Cyclic Contrastive Latent Representation Segmentation (CLRS), which incorporates cyclic modality-specific representation generation and contrastive feature alignment for robust 3D medical image segmentation under missing-modality conditions. CLRS first extracts features from the available modalities using a unified encoder, then generates the missing latent representations conditioned on the encoded features via a carefully designed synthesis strategy. Meanwhile, a channel-wise attention mechanism is introduced to enhance modality-specific features. In addition, modality-specific contrastive learning enforces cross-modal discrimination between the generated and encoded representations, which effectively disentangles modality-specific information from shared patterns and improves segmentation robustness in missing-modality scenarios. Extensive experiments on three 3D multimodal datasets demonstrate the superior performance of CLRS, particularly under severe modality absence. For instance, with only a single modality available on the ProstateZS dataset, CLRS improves on the state-of-the-art (SOTA) by over 4.06% for the peripheral zone and 2.20% for the central gland. The code is available at https://github.com/comphsh/CLRS.
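As a loose illustration only (the abstract does not give implementation details; see the linked repository for the authors' code), the following PyTorch sketch shows two generic components of the kind the abstract names: a squeeze-and-excitation style channel-wise attention block for 3D feature maps, and an InfoNCE-style contrastive loss that discriminates between generated and encoded modality representations. All class names, tensor shapes, and hyperparameters here are hypothetical assumptions, not the authors' design.

```python
# Minimal sketch, assuming SE-style channel attention and an InfoNCE-style
# contrastive objective; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention3D(nn.Module):
    """Channel-wise attention for 3D feature maps (squeeze-and-excitation style)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, D, H, W) -> squeeze spatial dims, re-weight each channel
        w = self.fc(x.mean(dim=(2, 3, 4)))          # (B, C) channel weights
        return x * w.view(*w.shape, 1, 1, 1)


def modality_contrastive_loss(gen: torch.Tensor, enc: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss: each generated latent is pulled toward the encoded
    latent of the same modality (the diagonal) and pushed away from the others.
    gen, enc: (N, D), one row per modality (or modality-sample pair)."""
    gen = F.normalize(gen, dim=1)
    enc = F.normalize(enc, dim=1)
    logits = gen @ enc.t() / temperature             # (N, N) cosine similarities
    targets = torch.arange(gen.size(0), device=gen.device)
    return F.cross_entropy(logits, targets)


# Usage with dummy tensors
feat = torch.randn(2, 32, 8, 16, 16)                 # (B, C, D, H, W)
attended = ChannelAttention3D(32)(feat)
loss = modality_contrastive_loss(torch.randn(4, 128), torch.randn(4, 128))
```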
DOI: 10.1109/jbhi.2025.3637570