SiMAE: Subject-identity Separation Latent Masked Autoencoder for Multi-contrast MRI Synthesis and Uncertainty Estimation
Keywords: multi-contrast MRI synthesis, tokenizer, latent masked autoencoder, uncertainty estimation
Abstract: Multi-contrast magnetic resonance imaging (MRI) provides complementary anatomical and pathological information, yet certain contrasts are often missing due to scan time, motion artifacts, or protocol variability. We present SiMAE, a masked autoencoder (MAE) operating in latent space that synthesizes arbitrary missing contrasts. The MAE framework naturally fits conditional synthesis by reconstructing masked content from visible context in a single pass, while latent-space training enables semantic reconstruction, suppresses pixel-space grid artifacts, and is computationally efficient. SiMAE employs a multi-contrast tokenizer with a shared encoder that maps each contrast into a common latent space and a joint decoder that outputs all contrasts simultaneously by aggregating cross-contrast cues. We train the latent MAE with a two-phase curriculum: (i) pre-training with random token masking to learn general anatomical context, and (ii) fine-tuning with random contrast masking to specialize the model for missing-contrast synthesis. We introduce a subject token, regularized by a subject-identity separation (SIS) loss, that serves as a compact representation capturing anatomical identity and subject-specific features. The subject token is withheld from the decoder to impose an information bottleneck that encourages context-driven, token-level reconstruction. We further estimate uncertainty by repeatedly masking tokens and resynthesizing them, producing uncertainty maps that highlight low-confidence regions. On the BraTS 2021 and ADNI datasets, SiMAE achieves state-of-the-art synthesis quality and preserves fine anatomical and pathological detail.
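The two-phase masking curriculum and the masking-based uncertainty estimation described in the abstract can be made concrete with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`random_token_mask`, `random_contrast_mask`, `uncertainty_map`), the token layout (a contiguous block of tokens per contrast), and the variance-based uncertainty readout are all hypothetical choices for exposition.

```python
import torch

def random_token_mask(tokens: torch.Tensor, mask_ratio: float) -> torch.Tensor:
    """Phase (i): mask a random subset of latent tokens across all contrasts.

    tokens: (B, N, D) latent tokens; returns a (B, N) boolean mask
    (True = masked), as in standard MAE pre-training.
    """
    B, N, _ = tokens.shape
    num_masked = int(N * mask_ratio)
    noise = torch.rand(B, N, device=tokens.device)
    mask_idx = noise.argsort(dim=1)[:, :num_masked]   # per-sample random token indices
    mask = torch.zeros(B, N, dtype=torch.bool, device=tokens.device)
    mask.scatter_(1, mask_idx, True)
    return mask

def random_contrast_mask(B: int, num_contrasts: int,
                         tokens_per_contrast: int, device) -> torch.Tensor:
    """Phase (ii): mask every token of one randomly chosen contrast per sample,
    simulating a missing contrast at fine-tuning time."""
    dropped = torch.randint(num_contrasts, (B,), device=device)       # (B,)
    contrast_id = torch.arange(num_contrasts, device=device)
    contrast_id = contrast_id.repeat_interleave(tokens_per_contrast)  # (N,)
    return contrast_id.unsqueeze(0) == dropped.unsqueeze(1)           # (B, N) bool

@torch.no_grad()
def uncertainty_map(model, tokens: torch.Tensor, target_mask: torch.Tensor,
                    mask_ratio: float = 0.5, num_samples: int = 8) -> torch.Tensor:
    """Repeatedly re-mask visible context tokens and resynthesize; the
    per-token variance across resyntheses serves as an uncertainty map."""
    preds = []
    for _ in range(num_samples):
        extra = random_token_mask(tokens, mask_ratio)     # perturb the visible context
        preds.append(model(tokens, mask=target_mask | extra))
    preds = torch.stack(preds)            # (S, B, N, D)
    return preds.var(dim=0).mean(dim=-1)  # (B, N): high variance = low confidence
```

Here `model(tokens, mask=...)` stands in for any latent MAE that reconstructs masked tokens from visible ones; the resulting per-token variance can be decoded or upsampled back to image space to highlight low-confidence regions.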
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5613