Keywords: digital pathology, computer vision, distributionally robust optimization
TL;DR: Distributionally robust optimization helps models avoid overfitting to spurious correlates in digital pathology.
Abstract: Computer vision (CV) approaches applied to digital pathology have informed biological discovery and clinical decision-making. However, batch effects in images represent a major challenge to effective analysis. A CV model trained using Empirical Risk Minimization (ERM) risks learning batch effects when they align with the labels and act as spurious correlates. The standard methods to avoid learning such confounders are (i) applying image augmentation techniques and (ii) checking the learned model through external validation (e.g., on unseen data from a comparable dataset collected at another hospital). The latter is data-hungry, and the former risks occluding biological signal. Here, we suggest two solutions from the Distributionally Robust Optimization (DRO) family. Our contributions are (i) a DRO algorithm using abstention, which is a slight variation on existing abstention-based DRO algorithms, and (ii) a group-DRO method in which groups are defined as the hospitals from which the data are collected. We find that the model trained using abstention-based DRO outperforms a model trained using ERM by 9.9% F1 in identifying tumor vs. normal tissue in lung adenocarcinoma (LUAD), at the expense of coverage. Further, by examining with a pathologist the areas on which the model abstains, we find that the model trained using a DRO method is more robust to heterogeneity and artifacts in the tissue. Taken together, these results lead us to propose selecting models that are more robust to spurious features for translational discovery and clinical decision support.
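To make the contrast between ERM and group-DRO concrete, the following is a minimal sketch rather than the authors' implementation. It assumes per-sample losses are already available and that each sample carries a hospital label. The function names (`erm_objective`, `group_dro_objective`, `abstain`), the toy numbers, and the confidence threshold are all hypothetical. In particular, the abstention rule shown is plain confidence thresholding, used here only to illustrate the coverage trade-off; it is not the abstention-based DRO algorithm described in the abstract.

```python
from collections import defaultdict


def group_mean_losses(losses, groups):
    """Average the per-sample losses within each group (here: hospital)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for loss, group in zip(losses, groups):
        totals[group] += loss
        counts[group] += 1
    return {g: totals[g] / counts[g] for g in totals}


def erm_objective(losses):
    """ERM minimizes the average loss over all samples, ignoring groups."""
    return sum(losses) / len(losses)


def group_dro_objective(losses, groups):
    """Group-DRO minimizes the loss of the worst-performing group."""
    return max(group_mean_losses(losses, groups).values())


def abstain(probs, threshold=0.8):
    """Illustrative stand-in: keep a prediction only if its confidence,
    max(p, 1 - p), reaches the (hypothetical) threshold.
    Returns the kept indices and the coverage (fraction kept)."""
    kept = [i for i, p in enumerate(probs) if max(p, 1.0 - p) >= threshold]
    return kept, len(kept) / len(probs)


# Toy data (hypothetical): hospital "B" is harder and under-represented.
losses = [0.1, 0.2, 0.3, 0.9, 1.1]
hospitals = ["A", "A", "A", "B", "B"]

print(erm_objective(losses))                    # average over all tiles
print(group_dro_objective(losses, hospitals))   # mean loss of worst hospital

# Predicted tumor probabilities for four tiles (hypothetical).
kept, coverage = abstain([0.95, 0.55, 0.10, 0.70], threshold=0.8)
print(kept, coverage)
```

In this toy example the average loss hides the poor performance on hospital "B", whereas the worst-group objective exposes it; the abstention step shows how withholding low-confidence predictions reduces coverage.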