Learning from Missing Data: Multimodal Hierarchical Variational Auto-Encoders for Medical Image Synthesis
Keywords: Missing Data, Hierarchical Variational Autoencoders, Image Synthesis
TL;DR: We present a principled multimodal hierarchical VAE that learns from incomplete data to synthesize missing images from any input subset, outperforming GAN, transformer, and diffusion baselines and improving segmentation and registration.
Registration Requirement: Yes
Abstract: Missing data is a widespread problem in multimodal medical imaging that presents significant practical and methodological challenges. While cross-modal synthesis has emerged as a promising strategy to estimate unavailable modalities from observed ones, most existing unified approaches assume complete multimodal datasets during training and rely on heuristic fusion strategies. In this short paper, we highlight the main ideas and findings of our recent work, where we introduced a Mixture of Multimodal Hierarchical Variational Auto-Encoders (MMHVAE) for unified cross-modal medical image synthesis from incomplete data. The model combines a hierarchical latent representation with a mixture of product-of-experts posterior to encode observed information, estimate missing information, and fuse arbitrary subsets of available modalities. The method was validated on the challenging problem of synthesis between multi-parametric MRI and intraoperative ultrasound in brain tumor patients, with additional evaluation on two downstream tasks: segmentation and registration.
This work shows that principled probabilistic models can learn rich and informative multimodal representations from incomplete imaging datasets.
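The fusion mechanism described above, a product-of-experts posterior that combines whichever modalities happen to be observed, can be sketched as follows. For Gaussian experts, the product is again Gaussian with precision-weighted mean, so any subset of modality-specific posteriors can be fused in closed form. This is a minimal illustrative sketch, not the authors' implementation: the `poe_fuse` helper, its signature, and the use of NumPy arrays are assumptions for exposition.

```python
import numpy as np

def poe_fuse(mus, logvars):
    """Fuse Gaussian experts via a product of experts (illustrative sketch).

    Each expert i contributes N(mu_i, var_i); their product is Gaussian with
    precision = sum of precisions and mean = precision-weighted average.
    Only the experts for *observed* modalities need to be passed in, which is
    how an arbitrary subset of available inputs can be handled.
    """
    precisions = [np.exp(-lv) for lv in logvars]  # 1 / var_i
    total_prec = sum(precisions)
    fused_mu = sum(p * m for p, m in zip(precisions, mus)) / total_prec
    fused_logvar = -np.log(total_prec)
    return fused_mu, fused_logvar
```

A useful property of this rule is that low-variance (confident) experts dominate the fused estimate, so an uninformative or missing modality naturally contributes little.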
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 56