Confirmation: I have read and agree with the IEEE BHI 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Biomarkers, classification algorithms, feature selection, gut microbiome, lung adenocarcinoma, machine learning, medical screening
Abstract: Lung adenocarcinoma (LUAD) is a major global health challenge that calls for more accessible, non-invasive screening methods. Traditional diagnostic approaches such as computed tomography and biopsy are effective but costly and resource-intensive, and they carry associated risks. This study uses gut microbiome data and machine learning to develop a non-invasive pre-screening tool for LUAD. Using a dataset of 107 fecal samples (43 LUAD and 64 healthy controls), we evaluated nine machine learning algorithms on four distinct feature sets generated by feature selection methods, with the aim of identifying informative microbial biomarkers and constructing accurate classification models. Our results demonstrate that feature selection significantly improves model performance compared to baseline approaches. A Random Forest model combined with Correlation-based Feature Selection achieved an Area Under the Curve (AUC) of 0.9967. Key taxa including Prevotella, Coprococcus, Phascolarctobacterium, Bilophila, Blautia, Enterococcus, and Bacteroides emerged as potential biomarkers. Functional predictions using PICRUSt2 revealed significant alterations in folate metabolism, methylation cycles, and photosynthetic bacterial activity, pointing to disrupted gut microbiome function in LUAD patients. These findings align with previous studies and suggest promising directions for non-invasive, cost-effective screening methods.
Track: 2. Bioinformatics
Registration Id: QYNDZ5X54DX
Submission Number: 170