STC3D: Self-Supervised Contrastive Learning with Spatial Transformations for 3D Medical Image Analysis
Abstract: Self-Supervised Learning (SSL) has demonstrated promising results in 3D medical image analysis, but traditional SSL methods often lack high-level semantics during pre-training, limiting performance on downstream tasks. Recent methods such as Volume Contrast (VoCo) address this by leveraging contextual position priors in 3D images, but VoCo relies on random cropping, which may reduce robustness to anatomical variations. In this paper, we propose STC3D, a novel SSL framework that applies controlled spatial transformations (rotation, translation, scaling) to generate multiple views of 3D volume images. These transformed views are then used for contrastive learning, enhancing invariance to transformations of anatomical structures. Additionally, STC3D includes a regularization branch that promotes feature discrepancy between different base slices, improving the discriminative power of the learned representations. Experimental results on several benchmark datasets, including BTCV, MSD Spleen, MM-WHS, and BraTS 21, demonstrate that STC3D outperforms existing methods on segmentation and classification tasks.
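The two ingredients the abstract names, transformed-view generation and contrastive learning over matching views, can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the restriction to 90-degree rotations and circular shifts, the nearest-neighbour rescaling, and the temperature value are all simplifying assumptions made here for clarity.

```python
# Illustrative sketch (NOT the STC3D code): generate spatially transformed
# views of a 3D volume and score matching views with an InfoNCE-style loss.
import numpy as np

def transform_view(volume, rng):
    """Produce one randomly transformed view of a cubic 3D volume."""
    # Rotation: a random multiple of 90 degrees in one plane (keeps shape).
    v = np.rot90(volume, k=int(rng.integers(0, 4)), axes=(0, 1))
    # Translation: a small circular shift along each axis.
    shift = tuple(int(s) for s in rng.integers(-4, 5, size=3))
    v = np.roll(v, shift, axis=(0, 1, 2))
    # Scaling: nearest-neighbour resample back onto the original grid.
    factor = float(rng.choice([0.8, 1.0, 1.25]))
    idx = [np.clip((np.arange(s) / factor).astype(int), 0, s - 1)
           for s in v.shape]
    return v[np.ix_(*idx)]

def info_nce(z1, z2, tau=0.1):
    """InfoNCE loss for row-aligned, L2-normalised embeddings of shape (N, D).

    Row i of z1 and row i of z2 are treated as a positive pair; all other
    rows in the batch act as negatives.
    """
    logits = z1 @ z2.T / tau                                # pairwise similarities
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                      # positives on the diagonal
```

In the full framework an encoder network would map each transformed view to an embedding before the loss is computed; the sketch shows the view generation and the loss independently of any particular backbone.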
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Lei_Wang13
Submission Number: 4581