MESH-HR: Multimodal Fusion of Histopathology and Structured Somatic Genomics for Continuous Breast Cancer Receptor Subtyping
Keywords: multimodal learning, structured tabular data, high-dimensional genomics, whole-slide imaging, multiple instance learning, digital biomarkers, clinical risk stratification, probabilistic modeling, continuous predictions, zero-shot generalization, domain shift, cancer of unknown primary, representation learning, structured data fusion, interpretability
TL;DR: MESH-HR fuses structured genomic tables with histopathology to predict continuous breast cancer receptor subtypes without IHC, outperforming imaging-only methods and enabling stratification in cancers of unknown primary.
Abstract: Breast cancer receptor subtyping guides treatment and prognosis, as eligibility for targeted therapies is determined by estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) expression or amplification status. While these biomarkers are clinically assessed using immunohistochemistry (IHC), computational approaches have largely focused on directly predicting receptor status from histopathology images, despite complementary signal in somatic genomic profiles. We introduce MESH-HR (Multimodal Ensemble of Somatic Variants and H&E Slides for Hormone Receptor subtyping), a model that integrates structured somatic genomic features and multi-resolution pathology images to predict ER, PR, and HER2 receptor status probabilities from routinely collected data, without requiring IHC. MESH-HR combines an attention-based multiple instance learning (ABMIL) vision encoder with an XGBoost model over structured genomics, capturing complementary signal across modalities. Trained on more than 1,300 breast cancers, MESH-HR achieves AUCs of 0.90 (ER), 0.84 (PR), and 0.96 (HER2) on held-out data, outperforming unimodal and prior imaging-only approaches, and generalizes zero-shot to The Cancer Genome Atlas Breast Cancer cohort (TCGA-BRCA). Continuous predictions improve survival stratification over binary outputs and discretized clinical IHC categories, recovering signal lost under discrete clinical labeling. We further apply MESH-HR to cancers of unknown primary (CUP), where receptor status is often unavailable, obtaining biologically consistent, survival-predictive subtype estimates that enable receptor-informed stratification.
Submission Number: 26