Abstract: Foundation models (FMs) have introduced new opportunities for zero-shot generalization in downstream tasks through techniques such as linear probing and feature extraction. However, their fairness, in particular their sensitivity to covariate bias, has not been systematically evaluated, even though such an evaluation is critical for clinical translation. In this study, we address this gap by constructing a unique dataset of identical glass slides digitized using two scanners, enabling a controlled simulation of covariate bias in data distributions. The same tissue regions (patches) are extracted from the Whole Slide Images acquired on each scanner and processed through FMs to obtain zero-shot feature representations. We define ‘representation shift’ as the difference in feature vectors across scanners and quantify it using metrics such as mean squared error, Kullback–Leibler divergence, and a novel clustering-based Calinski–Harabasz index. Our results demonstrate that FMs exhibit significant scanner-dependent variability in feature representations, highlighting a key generalization limitation. This sensitivity to the acquisition device introduces the risk of unequal or unfair performance, an important consideration for the safe and fair deployment of FMs in clinical settings.
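As an illustration only, the three kinds of metric named in the abstract can be sketched on paired feature matrices, where row i of each matrix holds the features of the same patch from one of the two scanners. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the softmax normalization used to turn feature vectors into distributions for the Kullback–Leibler divergence is an assumption, and the Calinski–Harabasz index is computed in its standard form with the scanner identity as the cluster label, which may differ from the clustering-based variant proposed in the paper.

```python
import numpy as np


def mse_shift(feats_a, feats_b):
    """Per-patch mean squared error between paired feature vectors."""
    return np.mean((feats_a - feats_b) ** 2, axis=1)


def _softmax(x):
    # Assumption: features are mapped to probability vectors with a softmax.
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kl_shift(feats_a, feats_b, eps=1e-12):
    """Per-patch KL(p_a || p_b) between softmax-normalized feature vectors."""
    p = _softmax(feats_a)
    q = _softmax(feats_b)
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)


def calinski_harabasz(features, labels):
    """Standard Calinski-Harabasz index: between-group over within-group dispersion.

    Assumes the within-group dispersion is non-zero.
    """
    n = features.shape[0]
    classes = np.unique(labels)
    k = classes.size
    overall_mean = features.mean(axis=0)
    between = 0.0
    within = 0.0
    for c in classes:
        group = features[labels == c]
        centroid = group.mean(axis=0)
        between += group.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((group - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    # Synthetic stand-in for FM features; no real slide data is used here.
    rng = np.random.default_rng(0)
    scanner_a = rng.normal(size=(200, 16))
    offset = rng.normal(scale=0.5, size=16)  # simulated scanner effect
    scanner_b = scanner_a + offset + 0.05 * rng.normal(size=(200, 16))

    stacked = np.vstack([scanner_a, scanner_b])
    scanner_id = np.array([0] * 200 + [1] * 200)

    print("mean MSE shift:", mse_shift(scanner_a, scanner_b).mean())
    print("mean KL shift:", kl_shift(scanner_a, scanner_b).mean())
    print("CH index by scanner:", calinski_harabasz(stacked, scanner_id))
```

In this sketch, a Calinski–Harabasz index near zero means the features from the two scanners are not separable by scanner, while larger values indicate that the representations cluster by acquisition device.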
External IDs:dblp:conf/miccai/ShafiqueDQADAK25