Leveraging Explainable Artificial Intelligence for Understanding the Effect of Model Capacity on Training Dataset Size
Abstract: Automated image classification has grown substantially in recent years. However, several applications suffer from limited availability of training data, such as the classification of medical images, where data collection is largely constrained by the privacy concerns of human subjects. To compensate for the limited training data, most of these applications employ custom-made lightweight architectures, whereas state-of-the-art deep models for computer vision typically rely on architectures with very large model capacity. However, an increase in model complexity and size does not necessarily guarantee better performance on small medical datasets. We study this phenomenon in the context of medical images, where several existing studies report that sophisticated deep networks for computer vision trained on large datasets such as ImageNet do not generalize well to medical image applications: their large model capacity leads to overfitting on smaller training datasets. In this research, we use explainable artificial intelligence to analyze the features learned by state-of-the-art deep models on smaller medical image training datasets and contrast them with the features learned on larger medical training datasets. In particular, we use SHapley Additive exPlanations (SHAP) to perform a qualitative comparison of feature relevance maps and to understand how different standard models, when trained with different training set sizes, identify discriminative image patterns for classification. Furthermore, we compare SHAP features across scenarios in which the same model examines images belonging to different classes. Experiments on two datasets of different sizes are presented to understand how the required model complexity depends on the number of samples in the training dataset.
Results demonstrate that simpler models learn generalizable SHAP features that allow them to perform well on small datasets, unlike larger models trained on the same small datasets. Conversely, larger models trained on larger datasets learn more distinctive and diverse features that allow them to outperform smaller models.
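The SHAP comparison described above rests on Shapley values: each feature's relevance is its average marginal contribution to the model output over all orderings in which features are revealed. As an illustration only (not the paper's implementation, which applies SHAP to deep image classifiers), the sketch below estimates Shapley values by permutation sampling, replacing "absent" features with a baseline; the function and variable names (`shapley_values`, `background`) are hypothetical.

```python
import numpy as np

def shapley_values(f, x, background, n_samples=500, seed=0):
    """Monte Carlo estimate of Shapley values for a single input x.
    Features absent from a coalition are set to their baseline values."""
    rng = np.random.default_rng(seed)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)      # random order of feature arrival
        z = background.astype(float).copy()
        prev = f(z)
        for i in perm:
            z[i] = x[i]                # add feature i to the coalition
            cur = f(z)
            phi[i] += cur - prev       # marginal contribution of feature i
            prev = cur
    return phi / n_samples

# Toy linear model: here the exact Shapley values are w * (x - background)
w = np.array([2.0, -1.0, 0.5])
f = lambda z: float(w @ z)
x = np.array([1.0, 2.0, 3.0])
bg = np.zeros(3)
phi = shapley_values(f, x, bg)
# Efficiency property: the attributions sum to f(x) - f(background)
```

For the linear toy model every permutation yields the same marginal contributions, so the estimate matches the analytic values exactly; for deep networks, libraries approximate this computation efficiently (e.g., the gradient-based explainers the SHAP framework provides).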