There Are No Shortcuts To Anywhere Worth Going: Identifying Shortcuts in Deep Learning Models for Medical Image Analysis

31 Jan 2024 (modified: 21 Mar 2024), MIDL 2024 Conference Submission, CC BY 4.0
Keywords: shortcut learning, bias, prediction depth, model interpretation, clinical machine learning, spurious correlations, model robustness, generalization
Abstract: Many studies have reported human-level accuracy (or better) for AI-powered algorithms performing a specific clinical task, such as detecting pathology. However, these results often fail to generalize to other scanners or populations. Several mechanisms that hinder generalization have been identified. One of these is shortcut learning, where a network erroneously learns to depend on a fragile spurious feature, such as a text label added to the image, rather than scrutinizing the genuinely useful regions of the image. Such systems can exhibit misleadingly high test-set results while the spurious feature is present, but fail badly elsewhere, where the relationship between the target label and the spurious feature breaks down. In this paper, we investigate whether it is possible to detect shortcut learning and to locate where in a neural network the shortcut is learned. We propose a novel methodology that uses the sample difficulty metric Prediction Depth (PD) and KL divergence to identify the specific layers of a neural network model where the learned features of a shortcut manifest. We demonstrate that our approach can effectively isolate these layers across several shortcuts, model architectures, and datasets. Using this approach, we show a correlation between the visual complexity of a shortcut, the depth of its feature manifestation within the model, and the extent to which the model relies on it. Finally, we highlight the nuanced relationship between learning rate and shortcut learning.
Submission Number: 270
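
Illustrative sketch (not the authors' implementation): the abstract names Prediction Depth (PD) and KL divergence but does not spell out how they are combined. The Python below assumes one plausible reading: PD is taken as the earliest layer from which a per-layer probe agrees with the model's final prediction at that layer and every deeper one, and the PD histogram of a dataset containing a shortcut is compared with that of a clean dataset using KL divergence. The function names, the toy probe predictions, and the smoothing constant `eps` are all hypothetical.

```python
import numpy as np


def prediction_depth(probe_preds, final_pred):
    """Earliest layer index from which every probe agrees with final_pred.

    probe_preds: per-layer probe predictions for ONE sample, shallow to deep.
    Returns len(probe_preds) if even the deepest probe disagrees.
    """
    depth = len(probe_preds)
    # Walk from the deepest layer backwards until a probe disagrees.
    for layer in range(len(probe_preds) - 1, -1, -1):
        if probe_preds[layer] != final_pred:
            break
        depth = layer
    return depth


def depth_histogram(depths, num_layers, eps=1e-6):
    """Normalized histogram of depths over 0..num_layers, smoothed by eps."""
    counts = np.bincount(np.asarray(depths), minlength=num_layers + 1)
    counts = counts.astype(float) + eps  # avoid log(0) in the KL term
    return counts / counts.sum()


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))


# Toy data (assumed): rows are samples, columns are probe predictions per layer.
num_layers = 4
clean_probes = np.array([[0, 0, 1, 1], [1, 0, 0, 0], [0, 1, 1, 1]])
clean_final = np.array([1, 0, 1])
shortcut_probes = np.array([[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]])
shortcut_final = np.array([1, 0, 1])

clean_pd = [prediction_depth(p, f) for p, f in zip(clean_probes, clean_final)]
shortcut_pd = [prediction_depth(p, f) for p, f in zip(shortcut_probes, shortcut_final)]

p = depth_histogram(shortcut_pd, num_layers)
q = depth_histogram(clean_pd, num_layers)
print("clean PDs:", clean_pd, "shortcut PDs:", shortcut_pd)
print("KL(shortcut || clean):", kl_divergence(p, q))
```

In this toy setting the shortcut samples are resolved at the first layer, so their PD histogram concentrates on shallow depths and diverges from the clean one. In practice the probe predictions would come from classifiers (for example k-nearest-neighbour probes) fitted on each layer's embeddings, a step omitted here.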