GEMCONT: Genetics-based Multimodal Contrastive Learning for Disease-Focused Imaging Genetics

Daniel Sens; Liubov Shilova; Adrian V Dalca; Julia Schnabel; Francesco Paolo Casale

GEMCONT: Genetics-based Multimodal Contrastive Learning for Disease-Focused Imaging Genetics

Daniel Sens, Liubov Shilova, Adrian V Dalca, Julia Schnabel, Francesco Paolo Casale

03 Dec 2025 (modified: 15 Dec 2025)MIDL 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Multimodal Contrastive Learning, Imaging Genetics, Genome-Wide Association Studies, Machine Learning–Derived Phenotypes, Medical Imaging

TL;DR: We propose GEMCONT, a genetics-based multimodal contrastive framework that aligns medical imaging with disease-specific variants to improve phenotype predictability and recovery of genetic associations.

Abstract: Genetic variation provides stable, time-invariant markers of disease risk and can therefore reveal upstream mechanisms underlying complex traits. Genome-wide association studies (GWAS) have identified thousands of loci associated with disease, yet most remain difficult to interpret because the intermediate phenotypes linking genotype to disease are unknown. Here, we address the question whether disease-associated genetic loci can be directly used to extract such risk-related features from quantitative phenotypes, including functional tests and medical imaging. We introduce GEMCONT, a multimodal contrastive learning framework that aligns genotype and phenotype representations in a shared latent space. By using known disease-associated variants as supervision, GEMCONT learns phenotypic embeddings guided by genetic information. To reflect the weak, additive nature of genetic effects, it employs a linear genetic encoder alongside a deep phenotypic encoder. We validate GEMCONT in controlled simulations and apply it to two real-world settings: spirometry curves for asthma and retinal fundus images for glaucoma. In both, GEMCONT improves disease risk prediction and enhances recovery of genetic associations compared with standard unsupervised or polygenic risk–based models. Altogether, our results demonstrate that incorporating stable genetic supervision into multimodal representation learning enables the extraction of genetically informed risk traits, refining disease phenotypes and improving the interpretability of association studies.

Primary Subject Area: Integration of Imaging and Clinical Data

Secondary Subject Area: Unsupervised Learning and Representation Learning

Registration Requirement: Yes

Visa & Travel: No

Read CFP & Author Instructions: Yes

Originality Policy: Yes

Single-blind & Not Under Review Elsewhere: Yes

LLM Policy: Yes

Submission Number: 321

Loading