Abstract: Lung cancer remains a leading cause of cancer-related deaths worldwide, underscoring the need for accurate and efficient detection methods. This paper investigates the use of the Medical Segmentation Decathlon dataset to train neural networks for semantic segmentation of lung cancer in CT scans. We propose four new data adaptation techniques designed specifically for this dataset and evaluate each of them using U-Net-based architectures. Our approach includes a thorough exploratory data analysis that identifies the dataset's strengths and weaknesses, and these findings guide our data preprocessing and augmentation strategies.