Generalized Representation Learning for Multimodal Histology Imaging Data Through Vision-Language Modeling
Track: Tiny Paper Track
Keywords: multiplexed spatial proteomics, digital pathology, contrastive learning, vision-language modeling
Abstract: We introduce a trimodal vision-language framework that unifies multiplexed spatial proteomics (SP), H&E histology, and textual metadata in a single embedding space. A specialized transformer-based SP encoder, together with pretrained H&E and language models, captures morphological, molecular, and semantic signals. Preliminary results show improved retrieval, zero-shot classification, and patient-level phenotype prediction, suggesting that this multimodal approach is promising for translational applications in digital pathology.
Attendance: Jacob Leiby
Submission Number: 49
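
The abstract describes aligning three modalities in a single embedding space, and the keywords mention contrastive learning. The listing does not give the training objective, so the following is only a minimal sketch of one common way such alignment could be set up: a symmetric InfoNCE loss averaged over the three modality pairs. The function names, the temperature value, and the equal pair weighting are all assumptions made for illustration, not details taken from the submission.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax_rows(z):
    """Numerically stable row-wise log-softmax."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def symmetric_info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches of paired embeddings.

    Row i of `a` and row i of `b` are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax_rows(logits)[idx, idx].mean()
    loss_ba = -log_softmax_rows(logits.T)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)


def trimodal_contrastive_loss(sp, he, text, temperature=0.07):
    """Average the pairwise losses over SP-H&E, SP-text, and H&E-text.

    Equal weighting of the three pairs is an assumption of this sketch.
    """
    pairs = [(sp, he), (sp, text), (he, text)]
    return sum(symmetric_info_nce(x, y, temperature) for x, y in pairs) / len(pairs)


# Toy stand-ins for encoder outputs (hypothetical shapes: batch of 8, dimension 16).
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 16))
aligned_loss = trimodal_contrastive_loss(shared, shared, shared)
random_loss = trimodal_contrastive_loss(
    rng.normal(size=(8, 16)), rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
)
```

In this toy setup the perfectly aligned embeddings yield a lower loss than unrelated random embeddings, which is the behavior a shared embedding space relies on for retrieval and zero-shot classification.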