xMADD: A Unified Diffusion Framework for Conditioned Synthesis of Medical Images and Waveforms

Sam Freesun Friedman, Sana Tonekaboni, Arash A. Nargesi, Caroline Uhler, Mahnaz Maddah

Published: 27 Nov 2025, Last Modified: 09 Dec 2025 · ML4H 2025 Poster · CC BY 4.0
Keywords: Diffusion, Generative models, Dataset augmentation, Modality translation, Digital twins
TL;DR: We present a simple, versatile method for generating high-resolution medical images and waveforms conditioned on a wide variety of phenotypes and cross-modal representations.
Track: Proceedings
Abstract: Diffusion models have shown remarkable success in generating high-quality perceptual data, but their use for controlled generation in biomedicine remains limited. We introduce xMADD (cross-Modal cross-Attention Denoising Diffusion), a conditional diffusion framework for producing diverse, high-resolution medical data, including cardiac MRI, brain MRI, and ECG waveforms, guided by clinical phenotypes, demographics, and multimodal signals. By incorporating cross-attention over conditional embeddings, xMADD enables control over generation. Compared to existing self-supervised and unsupervised generative approaches, xMADD achieves superior image fidelity and stability, while accurately reflecting conditioning phenotypes across modalities. Our results highlight the potential of controlled diffusion-based generation to expand biomedical datasets and facilitate data-sharing without compromising sensitive patient data.
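As a rough illustration of the mechanism the abstract describes (cross-attention over conditional embeddings inside a denoising network), a minimal PyTorch sketch might look like the following. The module names, tensor shapes, and embedding scheme here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Denoiser block that attends from image/waveform tokens (queries)
    to conditioning embeddings (keys/values), e.g. phenotype vectors.
    A sketch of the general technique, not xMADD's actual architecture."""
    def __init__(self, dim: int, cond_dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(
            embed_dim=dim, num_heads=num_heads,
            kdim=cond_dim, vdim=cond_dim, batch_first=True)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x:    (batch, tokens, dim)      -- noisy latent tokens at one step
        # cond: (batch, n_cond, cond_dim) -- phenotype / cross-modal embeddings
        attended, _ = self.attn(self.norm(x), cond, cond)
        return x + attended  # residual connection keeps the denoising path

# Toy usage: 64 latent tokens conditioned on 3 hypothetical phenotype embeddings.
block = CrossAttentionBlock(dim=128, cond_dim=32)
x = torch.randn(2, 64, 128)   # noisy tokens at some diffusion step
cond = torch.randn(2, 3, 32)  # e.g. age, sex, ejection-fraction embeddings
out = block(x, cond)          # same shape as x, now condition-aware
print(out.shape)              # torch.Size([2, 64, 128])
```

In a full diffusion model, a block like this would sit inside the denoiser at each step, so the predicted noise (and hence the generated sample) depends on the conditioning signal.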
General Area: Models and Methods
Specific Subject Areas: Medical Imaging, Representation Learning, Foundation Models
Data And Code Availability: Yes
Ethics Board Approval: Yes
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Submission Number: 62