Keywords: Unpaired Contrastive Learning, Spatial Transcriptomics, Drug Discovery
Abstract: Multimodal learning holds tremendous promise for biology, providing a path to integrate diverse data types and ultimately construct a more complete picture of underlying biological mechanisms. However, most existing approaches for multimodal learning require paired samples, an impractical assumption in biology, where measurements often destroy the sample (e.g., RNA sequencing). To address this challenge, we introduce IntraPair InterCluster (IPIC), a novel contrastive approach for multimodal learning that departs from traditional reliance on paired data by requiring only treatment-group labels. IPIC aligns modalities through intra-treatment group matching and inter-treatment group clustering, producing embeddings that are both accurate and biologically meaningful. In experiments on four curated multimodal biological datasets, IPIC consistently outperforms baseline approaches, highlighting its effectiveness in leveraging independently collected single-modality datasets for multimodal contrastive pre-training.
Primary Subject Area: Unsupervised Learning and Representation Learning
Secondary Subject Area: Application: Histopathology
Registration Requirement: Yes
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 137