Keywords: Multi-domain image-to-image translation, Domain generalization, Data augmentation, Histopathology
TL;DR: An image-to-image translation model trained on related histopathology data and used for data augmentation can make a model generalize better to out-of-distribution data.
Abstract: Histopathology Whole Slide Images (WSIs) present large illumination or color variations due to protocol variability (scanner, staining). This can strongly harm the generalization performance of deep learning algorithms. To address this problem, we propose to train a multi-domain image-to-image translation (I2IT) model on WSIs from The Cancer Genome Atlas Program (TCGA) and use it for data augmentation. Using TCGA WSIs from different cancer types has several advantages: our data augmentation method can be used for tasks where data is scarce, the I2IT model does not need to be retrained for each task, and the variability of TCGA protocols is high, leading to better robustness. The method's effectiveness is assessed on the Camelyon17 WILDS dataset, where we outperform sophisticated data augmentations and domain generalization (DG) methods. Results also confirm that training the I2IT model on unrelated histopathology data is much more effective for generalization than training it on the training data of the DG task.
Registration: I acknowledge that acceptance of this work at MIDL requires at least one of the authors to register and present the work during the conference.
Paper Type: novel methodological ideas without extensive validation
Primary Subject Area: Transfer Learning and Domain Adaptation
Secondary Subject Area: Application: Histopathology