Comparison of CNN models on a multi-scanner database in colon cancer histology

Published: 11 May 2021, Last Modified: 16 May 2023, MIDL 2021 Poster, Readers: Everyone
Keywords: Histopathology, Data Augmentation, Tissue Classification, CNN
TL;DR: Different CNN models were trained on data from a single scanner with data augmentation and then compared on a multi-scanner database for tissue classification.
Abstract: One of the most important challenges for computer-aided analysis in digital pathology is the development of robust deep neural networks that can cope with variations in color and resolution of digitized whole-slide images (WSIs). It has been shown that color augmentation during training helps a model generalize better to heterogeneous data. In this work, we compare the state-of-the-art models EfficientNet, Xception, Inception, ResNet, DenseNet, MobileNet and QuickNet on a multi-scanner database comprising slides each digitized with six different scanners. All networks are trained on data from only one scanner, using a combination of color and blur augmentation techniques. All models show similar tendencies across the different scanner databases but differ in overall classification accuracy. Differences in training and inference time, however, are more pronounced: on a mid-range GPU, the fastest model (QuickNet) is 13 times faster at inference than the slowest one (EfficientNet B4). There is also a trade-off between speed and accuracy: the slower networks are more stable across different scanners and show the best overall performance. A good compromise between quality and inference time is achieved by EfficientNet B0.
Paper Type: validation/application paper
Primary Subject Area: Application: Histopathology
Secondary Subject Area: Transfer Learning and Domain Adaptation
Paper Status: original work, not submitted yet
Source Code Url: The source code and data set are part of a larger framework which was and is partially funded by Fraunhofer internal grants. The revenue earned by licensing the source code or data (or a technology derived from the data) is required to fund future research and in particular increase the technology readiness level of the described innovation to ensure a market entry and by that a wide availability to users. Therefore the source code is not publicly available.
Data Set Url: The source code and data set are part of a larger framework which was and is partially funded by Fraunhofer internal grants. The revenue earned by licensing the source code or data (or a technology derived from the data) is required to fund future research and in particular increase the technology readiness level of the described innovation to ensure a market entry and by that a wide availability to users. Therefore the data is not publicly available.
Registration: I acknowledge that publication of this at MIDL and in the proceedings requires at least one of the authors to register and present the work during the conference.
Authorship: I confirm that I am the author of this work and that it has not been submitted to another publication before.
Source Latex: zip
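
The evaluation setup described in the abstract (train on one scanner, then test on data from every scanner) can be illustrated with a minimal sketch. This is not the authors' code, which is not publicly available. The function names, scanner identifiers, tissue labels and records below are all hypothetical and chosen only for illustration. The sketch computes classification accuracy separately per scanner, plus the gap between the best and worst scanner as a simple stability indicator.

```python
from collections import defaultdict


def accuracy_per_scanner(records):
    """Return {scanner: accuracy} for (scanner, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for scanner, truth, pred in records:
        total[scanner] += 1
        correct[scanner] += int(truth == pred)
    return {scanner: correct[scanner] / total[scanner] for scanner in total}


def accuracy_spread(per_scanner):
    """Gap between best and worst scanner; smaller means more stable across scanners."""
    values = list(per_scanner.values())
    return max(values) - min(values)


# Toy, made-up records (hypothetical scanner names and tissue classes).
records = [
    ("scanner_A", "tumor", "tumor"),
    ("scanner_A", "mucosa", "mucosa"),
    ("scanner_B", "tumor", "tumor"),
    ("scanner_B", "mucosa", "tumor"),
]

per_scanner = accuracy_per_scanner(records)
spread = accuracy_spread(per_scanner)
print(per_scanner, spread)
```

Reporting accuracy per scanner rather than pooled over all scanners keeps a model that only performs well on its training scanner from hiding behind a high average.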