Keywords: Contrastive Learning, Representation Learning, Disentanglement, MRI Sequences, Clinical MRI
TL;DR: We address the lack of fine-grained contrast representations in MRI by aligning image embeddings with DICOM metadata via contrastive learning, enabling contrast-aware representations for retrieval, classification, and harmonization.
Abstract: Magnetic Resonance Imaging (MRI) offers diverse contrasts and acquisition protocols, yet the lack of standardized labels across sites and scanners makes automated sequence classification and contrast-aware applications challenging. We propose a metadata-guided CLIP framework for learning 3D MRI contrast representations by aligning images with their DICOM metadata. This alignment enables the model to capture both contrast-specific and acquisition-related variations, yielding embeddings that support diverse downstream tasks such as image–metadata retrieval and sequence classification, and that can further serve as a foundation for contrast-invariant representation learning and cross-site harmonization. Evaluated on a large and heterogeneous clinical MRI dataset, our framework yields well-structured latent spaces, achieves strong image–metadata retrieval, and forms meaningful unsupervised clusters of MRI sequences. Furthermore, the learned embeddings enable few-shot sequence classification performance competitive with fully supervised 3D networks. Code and weights are publicly available at [anonymised].
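The core alignment objective described in the abstract can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it shows the standard symmetric CLIP-style contrastive loss, assuming pre-computed image and metadata embeddings of equal batch size and a hypothetical `temperature` hyperparameter:

```python
import numpy as np

def clip_contrastive_loss(img_emb, meta_emb, temperature=0.07):
    """Symmetric CLIP-style loss between image and metadata embeddings.

    img_emb, meta_emb: (N, D) arrays where row i of each is a matching pair.
    This is an illustrative sketch, not the paper's actual implementation.
    """
    # L2-normalize so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    meta = meta_emb / np.linalg.norm(meta_emb, axis=1, keepdims=True)
    logits = img @ meta.T / temperature  # (N, N); matching pairs on the diagonal
    labels = np.arange(logits.shape[0])

    def xent(l):
        # cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average the image-to-metadata and metadata-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls each image embedding toward its own DICOM-metadata embedding while pushing it away from the metadata of all other scans in the batch, which is what yields the contrast-aware latent structure used for retrieval and few-shot classification.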
Submission Number: 13