Keywords: Generative Adversarial Networks, Histology, Machine Learning, Whole Slide Imaging
TL;DR: This work seeks criteria for efficiently generating GAN-based synthetic datasets and evaluates how those datasets affect the quality of subsequent segmentation.
Abstract: In histopathology, staining quantification is a mandatory step to unveil and characterize disease progression and to assess drug efficacy in preclinical and clinical settings. Supervised Machine Learning (SML) algorithms allow such tasks to be automated but rely on large training datasets, which are not easily available in preclinical settings. Such datasets can be extended using traditional data augmentation methods, although the diversity of the generated images depends heavily on hyperparameter tuning. Generative Adversarial Networks (GANs) are a potentially efficient way to synthesize images with a parameter-independent range of staining distributions. Unfortunately, the generalization of such approaches is jeopardized by the scarcity of publicly available datasets. To address this issue, we propose a hybrid approach that mixes traditional data augmentation and GANs to produce partially or completely synthetic training datasets for segmentation. The augmented datasets are validated by two-fold cross-validation, using U-Net as the SML method and the F-score as the quantitative criterion.
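As an illustration of the validation criterion named in the abstract, the following is a minimal sketch of a pixel-wise F-score between two binary segmentation masks, together with a two-fold split of sample indices. It is not the authors' implementation: the function names `f_score` and `two_fold_indices`, the `beta` and `seed` parameters, and the convention of returning 1.0 for two empty masks are all assumptions made for this example, and the U-Net training step is omitted.

```python
import numpy as np


def f_score(pred, target, beta=1.0):
    """Pixel-wise F-beta score between two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()    # true positives
    fp = np.logical_and(pred, ~target).sum()   # false positives
    fn = np.logical_and(~pred, target).sum()   # false negatives
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else float((1 + b2) * tp / denom)


def two_fold_indices(n_samples, seed=0):
    """Return two (train, test) index pairs; each half is the test set once."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    half = n_samples // 2
    fold_a, fold_b = perm[:half], perm[half:]
    return [(fold_a, fold_b), (fold_b, fold_a)]
```

In a two-fold setting, a model would be trained on the first element of each pair and scored with `f_score` on the second, and the two scores would then be averaged.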
Registration: I acknowledge that publication of this at MIDL and in the proceedings requires at least one of the authors to register and present the work during the conference.
Authorship: I confirm that I am the author of this work and that it has not been submitted to another publication before.
Paper Type: validation/application paper
Primary Subject Area: Application: Histopathology
Secondary Subject Area: Segmentation