What Works in Chest X-Ray Classification? A Case Study of Design Choices

Published: 20 Jun 2023, Last Modified: 19 Jul 2023
Venue: IMLH 2023 Poster
Keywords: chest x-rays, clinical, deep-learning, design choices, comparative analysis
TL;DR: We study modelling choices in chest x-ray classification and find that very few improve over standard baseline choices.
Abstract: Public competitions and datasets have yielded increasingly accurate chest x-ray prediction models. The best such models now match even human radiologists on benchmarks. These models go beyond "standard" image classification techniques, and instead employ design choices specialized for the chest x-ray domain. However, as a result, each model ends up using a different, non-standardized training setup, making it unclear how individual design choices---be it the choice of model architecture, data augmentation type, or loss function---actually affect performance. So, which design choices should we use in practice? Examining a wide range of model design choices on three canonical chest x-ray benchmarks, we find that by simply leveraging a (properly tuned) model composed of standard image classification design choices, one can often match the performance of even the best domain-specific models. Moreover, starting from a "barebones," generic ResNet-50 with cross-entropy loss and no data augmentation, we discover that none of the proposed design choices---including broadly used choices like the DenseNet-121 architecture or basic data augmentation---consistently improve performance over that generic learning setup.
Submission Number: 15