From Static to Dynamic Diagnostics: Boosting Medical Image Analysis via Motion-Informed Generative Videos

Published: 01 Jan 2024, Last Modified: 06 Nov 2024 · MICCAI (3) 2024 · CC BY-SA 4.0
Abstract: In the field of intelligent healthcare, the accessibility of medical data is severely constrained by privacy concerns, high costs, and limited patient cases, significantly hindering automated clinical assistance. Although previous efforts have synthesized medical images via generative models, they are limited to static imagery that fails to capture the dynamic motions seen in clinical practice, such as the contractile patterns of organ walls, leading to fragile diagnostic predictions. To tackle this issue, we propose a holistic paradigm, VidMotion, that boosts medical image analysis with generative medical videos, representing the first exploration in this field. VidMotion consists of a Motion-guided Unbiased Enhancement (MUE) module that augments static images into dynamic videos at the data level and a Motion-aware Collaborative Learning (MCL) module that learns jointly from images and generated videos at the model level. Specifically, MUE first transforms medical images into videos enriched with diverse clinical motions, guided by image-to-video generative foundation models. Then, to avoid potential clinical bias caused by class imbalance in the generated videos, we design an unbiased sampling strategy informed by a statistical class-distribution prior, thereby extracting high-quality video frames. In MCL, we perform joint learning over the image and video representations, including a video-to-image distillation and an image-to-image consistency, to fully capture the intrinsic motion semantics for motion-informed diagnosis. We validate our method on extensive semi-supervised learning benchmarks and show that VidMotion is highly effective and efficient, significantly outperforming state-of-the-art approaches. The code is available at https://github.com/CUHK-AIM-Group/VidMotion.
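The abstract names two mechanisms without spelling them out: class-prior-informed unbiased frame sampling (MUE) and a joint objective combining video-to-image distillation with image-to-image consistency (MCL). The following is a minimal PyTorch sketch of one plausible reading of those two ideas; the function names, the inverse-frequency weighting rule, the distillation temperature `tau`, and the loss weight `lam` are all illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn.functional as F


def unbiased_frame_sampling(frames_per_class, class_counts, budget):
    """Hypothetical class-prior-informed sampling: draw more frames from
    under-represented classes so the augmented pool is balanced.

    frames_per_class: dict {class_id: tensor [N_c, C, H, W]} of generated frames
    class_counts:     dict {class_id: int} original image counts (the prior)
    budget:           total number of frames to keep
    """
    total = sum(class_counts.values())
    # Inverse-frequency weights: rarer classes get proportionally more frames.
    weights = {c: total / n for c, n in class_counts.items()}
    z = sum(weights.values())
    sampled = []
    for c, frames in frames_per_class.items():
        k = min(len(frames), max(1, round(budget * weights[c] / z)))
        idx = torch.randperm(len(frames))[:k]
        sampled.append(frames[idx])
    return torch.cat(sampled, dim=0)


def mcl_loss(img_logits, vid_logits, feat_a, feat_b, tau=2.0, lam=0.5):
    """Sketch of the joint objective: KL distillation from the video branch's
    (detached) logits into the image branch, plus an MSE consistency term
    between the features of two augmented image views."""
    distill = F.kl_div(
        F.log_softmax(img_logits / tau, dim=1),
        F.softmax(vid_logits.detach() / tau, dim=1),
        reduction="batchmean",
    ) * tau ** 2
    consistency = F.mse_loss(feat_a, feat_b)
    return distill + lam * consistency


if __name__ == "__main__":
    # Toy data: class 1 is rare in the original images, so it contributes
    # a larger share of the sampled frame pool.
    frames = {0: torch.randn(40, 3, 32, 32), 1: torch.randn(8, 3, 32, 32)}
    counts = {0: 100, 1: 20}
    pool = unbiased_frame_sampling(frames, counts, budget=24)
    print(pool.shape)  # e.g., torch.Size([12, 3, 32, 32])
```

Detaching the video-branch logits treats the video representation as a fixed teacher for that term, which is one common way to realize a one-directional video-to-image distillation; whether VidMotion stops gradients this way is not stated in the abstract.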