On the Correlations between Performance of Deep Networks and Its Robustness to Common Image Perturbations in Medical Image Interpretation

Chak Fong Chong, Xinyi Fang, Xu Yang, Wuman Luo, Yapeng Wang

Published: 01 Jan 2023, Last Modified: 10 Feb 2025, Venue: DICTA 2023, License: CC BY-SA 4.0

Abstract: Robustness to common image perturbations is crucial for deep learning models that interpret medical images: in clinical use, images may come from different institutions and contain perturbations that did not appear in the training data, which degrades interpretation performance. In this paper, we investigate how the performance of 28 ImageNet models correlates with their robustness under 6 image perturbation types at 10 severity levels on the CheXpert chest X-ray (CXR) classification dataset. The results demonstrate that: (1) A model with higher ImageNet accuracy tends, after fine-tuning on CheXpert for CXR classification, to be more robust on perturbed CXRs. (2) A model with higher CXR classification performance after fine-tuning on CheXpert is not necessarily more robust on perturbed CXRs; this depends on the severity of the perturbations, and under stronger perturbations, models with lower CXR performance tend to be more robust instead. (3) Model architecture may be a key factor in robustness. For instance, regardless of model size, EfficientNet and EfficientNetV2 models tend to be more robust, while ResNet models tend to be more vulnerable. Our work can help in selecting or designing robust models for medical image interpretation and thereby improve their suitability for clinical applications.
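To make the kind of analysis described in the abstract concrete, here is a minimal sketch of how a per-severity correlation between a reference score (for example, ImageNet accuracy) and performance on perturbed images might be computed. The abstract does not say which correlation measure or perturbation implementation the authors used, so several things here are assumptions: the use of Spearman rank correlation, Gaussian noise as the example perturbation, the `0.02 * severity` noise scale, and all function names (`rankdata`, `spearman`, `add_gaussian_noise`, `correlation_per_severity`), which are hypothetical.

```python
import numpy as np


def rankdata(values):
    """Return 1-based ranks; tied values share their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumption: the paper may use a different correlation measure.
    """
    return float(np.corrcoef(rankdata(x), rankdata(y))[0, 1])


def add_gaussian_noise(image, severity, rng):
    """Perturb an image with values in [0, 1] using Gaussian noise.

    The mapping from severity (1..10) to noise scale is an assumed,
    illustrative choice, not the one used in the paper.
    """
    sigma = 0.02 * severity
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def correlation_per_severity(reference_scores, perturbed_scores):
    """Correlate one reference score per model with its perturbed scores.

    reference_scores: shape (n_models,)
    perturbed_scores: shape (n_models, n_levels), one column per severity
    Returns a list with one correlation coefficient per severity level.
    """
    perturbed_scores = np.asarray(perturbed_scores, dtype=float)
    return [
        spearman(reference_scores, perturbed_scores[:, k])
        for k in range(perturbed_scores.shape[1])
    ]


if __name__ == "__main__":
    # Toy placeholder values for four hypothetical models at two
    # severity levels. These are NOT results from the paper.
    reference = [0.70, 0.75, 0.80, 0.85]
    perturbed = [[0.60, 0.50], [0.62, 0.48], [0.65, 0.45], [0.70, 0.40]]
    print(correlation_per_severity(reference, perturbed))
```

Computing one coefficient per severity level, rather than a single pooled value, lets the sign of the correlation change as perturbations get stronger. That is the pattern finding (2) describes.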