Self-Distillation Representation Learning and Parameter-Efficient Fine-Tuning for Pretrained Models in Multimodal 3D Medical Imaging
Keywords: Self-supervised learning, Medical imaging, Parameter-efficient fine-tuning
TL;DR: Used self-supervised learning and LoRA on 3D medical imaging modalities to reduce reliance on expert annotations in the FLARE2025 Challenge Task 4.
Abstract: Applying deep learning (DL) to medical imaging generally requires large amounts of expert-annotated data. Foundation models, pretrained on large unlabeled datasets, offer a promising way to reduce this reliance. While self-supervised learning (SSL) has advanced foundation model development for 2D natural and medical images, extending these methods to 3D medical imaging remains computationally challenging and is often constrained by limited pretraining data. In this work, we adapt the 3DINO framework, an extension of DINOv2 to volumetric inputs, to the MICCAI-FLARE25 Challenge Task 4, leveraging 20,000 unlabeled CT and MRI scans for pretraining. To efficiently transfer the learned representations to diverse downstream tasks while avoiding overfitting, we employ parameter-efficient fine-tuning via Low-Rank Adaptation (LoRA). Our results demonstrate that combining 3DINO pretraining with LoRA improves performance across segmentation, classification, survival prediction, and regression. These findings highlight the potential of SSL-pretrained models to enable more label-efficient training for diverse 3D medical imaging applications.
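As context for the LoRA component of the abstract, the sketch below illustrates the general idea rather than the submission's actual implementation: a hypothetical LoRALinear module freezes a pretrained linear projection (e.g., an attention projection in a 3D ViT backbone) and trains only a low-rank update. The rank r and scaling alpha values are illustrative assumptions, not values reported by the authors.

```python
# Minimal LoRA sketch (illustrative only): wrap a frozen nn.Linear from a
# pretrained backbone with low-rank factors A and B so that only
# r * (in_features + out_features) parameters are updated during fine-tuning.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        # A is initialized with small random values, B with zeros, so the
        # adapted layer starts out identical to the pretrained one.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen projection plus the scaled low-rank update
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

Applied to, for instance, the attention projections of each transformer block, this kind of adapter keeps the trainable parameter count to a small fraction of the full backbone, which is the usual motivation for using LoRA when fine-tuning a large pretrained model on limited labeled data.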
Submission Number: 4