Pan-Cancer Tumor Infiltrating Lymphocyte Detection based on Federated Learning

Ujjwal Baid, Sarthak Pati, Tahsin M. Kurç, Rajarsi Gupta, Erich Bremer, Shahira Abousamra, Siddhesh P. Thakur, Joel H. Saltz, Spyridon Bakas

Published: 01 Jan 2024, Last Modified: 04 Mar 2025, IEEE Big Data 2024, CC BY-SA 4.0

Abstract: Advances in deep learning (DL) have shown great promise in revolutionizing healthcare, but their success hinges on the availability of large, diverse, centralized data. Such centralization is challenging because of numerous concerns relating to privacy, data ownership, intellectual property, and compliance with varying regulatory policies. Federated learning (FL) offers a new decentralized paradigm for training DL models in healthcare. In this study, we evaluate the effect of FL in developing DL models for the analysis of digitized tissue sections, specifically whole slide images (WSIs). As an example use case, we consider a classification application that quantifies the distribution of Tumor Infiltrating Lymphocytes (TILs), a critical biomarker in cancer research that provides valuable insights into patient outcomes. We trained a VGG classification model using 50 × 50 micron patches extracted from the WSIs, each with an associated TIL/non-TIL label. We simulated an FL environment in which different cancer types are distributed across the collaborating nodes. Our results show that the model trained with the federated approach achieves performance similar, both quantitatively and qualitatively, to that of a model trained with all the training data pooled at a centralized location. Our study shows that FL has tremendous potential for enabling the development of more robust and accurate models for histopathology image analysis without having to collect large and diverse training data at a single location. For TILs in particular, our FL approach yields a single DL model that is trained across numerous anatomical sites and generalizes robustly to unseen cancer types.
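
To make the federated setup more concrete, the following is a minimal sketch of one aggregation round. The abstract does not state which aggregation rule was used; this sketch assumes a FedAvg-style weighted average, in which each collaborating node's model parameters are weighted by its number of training patches. The function name `federated_average`, the site names, the parameter values, and the patch counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def federated_average(site_weights: Dict[str, List[float]],
                      site_sizes: Dict[str, int]) -> List[float]:
    """Combine per-site parameter vectors into one global vector.

    ASSUMPTION: FedAvg-style aggregation, where each site's contribution
    is proportional to its number of local training patches. The abstract
    does not confirm that this rule was used in the study.
    """
    total = sum(site_sizes[site] for site in site_weights)
    if total <= 0:
        raise ValueError("total number of training patches must be positive")

    n_params = len(next(iter(site_weights.values())))
    global_weights = [0.0] * n_params

    for site, weights in site_weights.items():
        if len(weights) != n_params:
            raise ValueError(f"parameter count mismatch at {site}")
        share = site_sizes[site] / total  # fraction of all patches at this site
        for i, w in enumerate(weights):
            global_weights[i] += share * w

    return global_weights


# Hypothetical toy round: two sites, each holding a two-parameter "model".
updates = {"site_A": [1.0, 2.0], "site_B": [3.0, 4.0]}
sizes = {"site_A": 100, "site_B": 300}
global_model = federated_average(updates, sizes)
print(global_model)  # [2.5, 3.5]
```

In a real setting the parameter vectors would be the flattened weights of the classification network after local training at each node, and the averaged result would be sent back to every node for the next round. Only model parameters leave a site; the patches themselves stay local.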