Enhancing Lungs Tumor Detection with Generative Models

Pietro Picchione; Robin Ghyselinck; Benoit Frenay; Bruno Dumas

Published: 15 Oct 2025, Last Modified: 31 Oct 2025. Venue: BNAIC/BeNeLearn 2025 (Oral). License: CC BY 4.0

Track: Type A (Regular Papers)

Keywords: Deep learning, Generative models, SinGAN-Seg, Bronchoscopy, Data augmentation, Lungs cancer

Abstract: Early diagnosis of lung cancer largely relies on the interpretation of bronchoscopic images, a complex task that strongly depends on clinical expertise. Deep learning has shown potential in computer-aided detection of lung cancer, but its performance remains limited by the scarcity of annotated data in endoscopic imaging. To address this constraint, we explore the use of synthetic data generated with SinGAN-Seg, a model capable of producing realistic images from a single example. In this study, 257 bronchoscopic images from 64 patients were used to train 257 individual SinGAN-Seg models. The generated images were then filtered and integrated in increasing proportions (5% to 100%) into the training sets of three classification models (ResNet-18, ResNet-50, EfficientNet-B7). Performance was evaluated on an independent validation set. The results show that a moderate addition of synthetic data (up to 33% for ResNet-18 and 66% for EfficientNet-B7) can improve the models' robustness, especially their generalization ability. However, an excess of artificial data leads to a decrease in sensitivity. These findings confirm the potential of generation-based augmentation, while emphasizing the importance of balanced integration.
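As an illustration of the "increasing proportions" step, the following is a minimal sketch of how a training set could be mixed with synthetic images at a given ratio. It is not the authors' code. It assumes that the proportion is measured relative to the number of real training images (the abstract does not specify the reference quantity), and the function name `mix_synthetic`, the item tuples, and the set sizes are all hypothetical.

```python
import random


def mix_synthetic(real_items, synthetic_items, proportion, seed=0):
    """Return the real items plus a random subset of synthetic items.

    ASSUMPTION: `proportion` is relative to the number of real items,
    so 0.33 adds round(0.33 * len(real_items)) synthetic samples.
    """
    if proportion < 0:
        raise ValueError("proportion must be non-negative")
    real_items = list(real_items)
    synthetic_items = list(synthetic_items)
    n_synth = round(proportion * len(real_items))
    if n_synth > len(synthetic_items):
        raise ValueError("not enough synthetic items for this proportion")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    chosen = rng.sample(synthetic_items, n_synth)
    mixed = real_items + chosen
    rng.shuffle(mixed)  # avoid ordering real before synthetic
    return mixed


# Hypothetical placeholders standing in for (source, image id) pairs.
real = [("real", i) for i in range(200)]
synthetic = [("synthetic", i) for i in range(1000)]

# One training set per tested proportion (values chosen for illustration).
training_sets = {p: mix_synthetic(real, synthetic, p) for p in (0.05, 0.33, 0.66, 1.0)}
```

Each mixed set would then be used to train a separate classifier, with evaluation done on real validation images only, so that the effect of the synthetic share can be compared across proportions.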

Submission Number: 15