- Keywords: digital pathology, deep learning, domain transfer, BRAF, NTRK, oncogenic driver detection, WSI classification, attention multiple instance learning, histology, histopathology, H&E slide
- TL;DR: We train classification models to detect oncogenic drivers in histopathology images and assess how well the models generalize within and across different cohorts.
- Abstract: In this paper, we describe the machine learning problem of identifying different types of tumors based on digital pathology images. Given a set of Hematoxylin and Eosin (H&E) stained images of thyroid tumors, we train deep learning models to detect two known molecular oncogenic drivers: BRAF mutations and NTRK gene fusions. We implement an attention-based multiple instance learning (MIL) classifier and assess its generalization within and across three independent cohorts. We find that the model can detect both oncogenic drivers with the MIL approach. However, the problem remains challenging: our exhaustive evaluation scenarios reveal unknown data drifts and batch effects in digital pathology, as model performance decreases when processing images from an unseen cohort. These findings highlight the necessity of rich and diverse datasets for training and evaluation, as well as methods for domain-agnostic learning.
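
To make the attention-based MIL idea concrete, the following is a minimal sketch of a single forward pass, not the authors' implementation. It assumes each slide has already been cut into tiles and that every tile has a fixed-length embedding from some feature extractor. The function name `attention_mil_forward`, the tanh attention form, the dimensions, and the random (untrained) weights are all illustrative assumptions.

```python
import numpy as np

def attention_mil_forward(tiles, V, w, clf_w, clf_b):
    """Score one slide from its bag of tile embeddings.

    tiles: (n_tiles, d) array, one embedding per tile (assumed precomputed).
    V, w:  attention parameters, shapes (d, hidden) and (hidden,).
    clf_w, clf_b: linear classifier on the pooled slide embedding.
    """
    # One unnormalized attention score per tile.
    scores = np.tanh(tiles @ V) @ w
    # Numerically stable softmax over the tiles of this slide.
    scores = scores - scores.max()
    attn = np.exp(scores) / np.exp(scores).sum()
    # Attention-weighted average of the tile embeddings.
    slide_embedding = attn @ tiles
    # Slide-level probability for a binary label (e.g. driver present or not).
    logit = slide_embedding @ clf_w + clf_b
    prob = 1.0 / (1.0 + np.exp(-logit))
    return prob, attn

# Hypothetical toy data: 50 tiles with 16-dimensional embeddings.
rng = np.random.default_rng(0)
n_tiles, d, hidden = 50, 16, 8
tiles = rng.normal(size=(n_tiles, d))
V = rng.normal(size=(d, hidden))
w = rng.normal(size=hidden)
clf_w = rng.normal(size=d)
clf_b = 0.0

prob, attn = attention_mil_forward(tiles, V, w, clf_w, clf_b)
```

Because the pooling is a weighted sum over tiles, the slide-level prediction does not depend on tile order, and the attention weights indicate which tiles contributed most to the prediction.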