WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis

Published: 01 Jan 2024, Last Modified: 16 Dec 2024 · DGM4MICCAI @ MICCAI 2024 · CC BY-SA 4.0
Abstract: Due to the three-dimensional nature of CT or MR scans, generative modeling of medical images is a particularly challenging task. Existing approaches mostly apply patch-wise, slice-wise, or cascaded generation techniques to fit the high-dimensional data into the limited GPU memory. However, these approaches may introduce artifacts and potentially restrict the model's applicability for certain downstream tasks. This work presents WDM, a wavelet-based medical image synthesis framework that applies a diffusion model on wavelet-decomposed images. The presented approach is a simple yet effective way of scaling 3D diffusion models to high resolutions and can be trained on a single 40 GB GPU. Experimental results on BraTS and LIDC-IDRI unconditional image generation at a resolution of \(128 \times 128 \times 128\) demonstrate state-of-the-art image fidelity (FID) and sample diversity (MS-SSIM) scores compared to recent GANs, Diffusion Models, and Latent Diffusion Models. Our proposed method is the only one capable of consistently generating high-quality images at a resolution of \(256 \times 256 \times 256\), outperforming all competing methods. The project page is available at https://pfriedri.github.io/wdm-3d-io.
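To illustrate the core idea behind the framework, the sketch below shows a single-level 3D Haar discrete wavelet transform in NumPy. It is a minimal illustration of the general principle, not the authors' implementation: a \(128 \times 128 \times 128\) volume is split into eight half-resolution subbands, so a diffusion model can operate on an 8-channel \(64 \times 64 \times 64\) tensor instead of the full-resolution volume, and the inverse transform recovers the volume exactly. All function names here are hypothetical.

```python
import numpy as np


def haar_step(x, axis):
    """One Haar analysis step along `axis`: returns (low, high) halves."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)


def dwt3d(vol):
    """Single-level 3D Haar DWT: a (D, H, W) volume becomes eight
    (D/2, H/2, W/2) subbands stacked along a new leading channel axis."""
    bands = [vol]
    for axis in range(3):
        bands = [half for band in bands for half in haar_step(band, axis)]
    return np.stack(bands, axis=0)


def ihaar_step(lo, hi, axis):
    """Invert one Haar step: recover even/odd samples and interleave them."""
    even = (lo + hi) / np.sqrt(2)
    odd = (lo - hi) / np.sqrt(2)
    shape = lo.shape[:axis] + (2 * lo.shape[axis],) + lo.shape[axis + 1:]
    out = np.empty(shape, dtype=even.dtype)
    idx_even = [slice(None)] * lo.ndim
    idx_even[axis] = slice(0, None, 2)
    idx_odd = [slice(None)] * lo.ndim
    idx_odd[axis] = slice(1, None, 2)
    out[tuple(idx_even)] = even
    out[tuple(idx_odd)] = odd
    return out


def idwt3d(subbands):
    """Inverse of dwt3d: merge the eight subbands back into one volume."""
    bands = list(subbands)
    for axis in reversed(range(3)):
        bands = [ihaar_step(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]


vol = np.random.rand(128, 128, 128).astype(np.float32)
subbands = dwt3d(vol)
print(subbands.shape)  # (8, 64, 64, 64)

recon = idwt3d(subbands)
print(np.allclose(vol, recon, atol=1e-4))  # True: Haar DWT is invertible
```

Because the orthonormal Haar transform is lossless, generating in wavelet space and applying the inverse transform incurs no reconstruction penalty, while the spatial resolution the network must process is halved along each axis.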
