Virtual-Eyes: Quantitative Validation of a Lung CT Quality-Control Pipeline for Foundation-Model Cancer Risk Prediction
Keywords: Lung Cancer Screening, Foundation Models, Quality Control, CT Preprocessing, Validation
TL;DR: A lung-aware 16-bit CT preprocessing pipeline that significantly stabilizes and improves foundation-model performance for LDCT cancer risk prediction while revealing shortcut dependence in specialist models.
Abstract: Robust preprocessing is rarely quantified in deep-learning pipelines for low-dose CT (LDCT) lung cancer screening. We develop and validate \emph{Virtual-Eyes}, a clinically motivated, 16-bit CT quality-control pipeline for NLST, and measure its differential impact on generalist foundation models (FMs) versus specialist models. Virtual-Eyes enforces strict $512\times512$ in-plane resolution, rejects short or non-diagnostic series, and extracts a contiguous lung block using Hounsfield-unit filtering and bilateral lung-coverage scoring, while preserving the original 16-bit DICOM grid. Using 765 NLST patients (182 cancer, 583 non-cancer), we compute slice-level embeddings from RAD-DINO and Merlin with frozen encoders and train leakage-free patient-level MLP heads. We also apply Virtual-Eyes to Sybil and a 2D ResNet-18 baseline without retraining their backbones. For RAD-DINO, preprocessing improves slice-level AUC from 0.576 to 0.610 and patient-level AUC from 0.646 to 0.683 (mean pooling) and from 0.619 to 0.735 (max pooling). These gains are accompanied by reduced distributional drift between raw and preprocessed outputs (KS $D=0.041$, $p<10^{-80}$) and better calibration (Brier score $0.188 \to 0.112$). In contrast, Sybil degrades under Virtual-Eyes (AUC $0.886 \to 0.837$) and becomes overconfident (Brier $0.092 \to 0.145$), while the ResNet-18 baseline remains weak (AUC $0.571 \to 0.596$); both show evidence of shortcut or context-dependent learning. Merlin exhibits limited transferability to thoracic risk prediction (AUC $\approx 0.507$--$0.567$) regardless of preprocessing. To our knowledge, this is the first quantitative validation of lung-aware preprocessing for LDCT foundation-model workflows. Our results highlight that anatomically targeted QC can meaningfully stabilize and improve generalist FMs, but may disrupt specialist models that have adapted to raw clinical context.
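The lung-block extraction described above (HU filtering plus bilateral coverage scoring over a slice stack) can be sketched as follows. This is a minimal illustration, not the released Virtual-Eyes code: the HU window (-950 to -400), the 5% coverage threshold, and the left/right split as a proxy for bilateral coverage are all assumed parameters chosen for clarity.

```python
import numpy as np
from typing import Optional

# Assumed window for air/lung parenchyma; the actual pipeline may differ.
LUNG_HU_MIN, LUNG_HU_MAX = -950, -400


def lung_coverage_score(slice_hu: np.ndarray) -> float:
    """Score one axial slice by the fraction of lung-window pixels.

    Bilateral scoring: the slice is split into left and right halves and
    the *weaker* side is returned, so slices above/below the lungs (or
    with only one lung in view) score low.
    """
    mask = (slice_hu >= LUNG_HU_MIN) & (slice_hu <= LUNG_HU_MAX)
    h, w = mask.shape
    left = mask[:, : w // 2].mean()
    right = mask[:, w // 2:].mean()
    return float(min(left, right))


def extract_lung_block(volume_hu: np.ndarray, thresh: float = 0.05,
                       min_len: int = 3) -> Optional[slice]:
    """Return the longest contiguous run of slices with coverage > thresh.

    volume_hu: (n_slices, H, W) array in Hounsfield units.
    Returns a slice object into the z-axis, or None if no run of at
    least `min_len` qualifying slices exists (series rejected).
    """
    scores = np.array([lung_coverage_score(s) for s in volume_hu])
    ok = scores > thresh
    best, start = None, None
    for i, flag in enumerate(np.append(ok, False)):  # sentinel closes runs
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if best is None or i - start > best.stop - best.start:
                best = slice(start, i)
            start = None
    if best is not None and best.stop - best.start >= min_len:
        return best
    return None
```

Operating on the z-axis run of per-slice scores, rather than a global lung mask, keeps the extracted block contiguous and leaves the 16-bit in-plane grid untouched.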
Primary Subject Area: Foundation Models
Secondary Subject Area: Transfer Learning and Domain Adaptation
Registration Requirement: Yes
Reproducibility: Yes. We will publicly release the full Virtual-Eyes code (including HU-based lung detection and block extraction), configuration files, and MLP training scripts. NLST is accessible through the National Cancer Institute; we will provide instructions to reproduce splits and evaluation. In parallel, we are working with The Cancer Imaging Archive to integrate Virtual-Eyes as a reusable preprocessing option for NLST and related LDCT collections, so that users will be able to apply the same QC pipeline directly within TCIA and download preprocessed lung blocks for downstream modeling.
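For readers reproducing the patient-level evaluation, the aggregation and metrics reported in the abstract (mean/max pooling of slice scores, AUC, Brier score) can be sketched as below. This is an illustrative stand-in for the released scripts; function names and the pairwise AUC formulation are our own, not taken from the submission's codebase.

```python
import numpy as np


def pool_patient_score(slice_probs, mode: str = "max") -> float:
    """Aggregate slice-level cancer probabilities into one patient score.

    The abstract reports both mean pooling and max pooling of the
    per-slice outputs from the frozen-encoder + MLP head.
    """
    p = np.asarray(slice_probs, dtype=float)
    return float(p.max() if mode == "max" else p.mean())


def brier_score(y_true, y_prob) -> float:
    """Mean squared error between predicted probability and 0/1 label."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def auc(y_true, y_prob) -> float:
    """AUC via pairwise comparison (Mann-Whitney formulation).

    Fraction of (positive, negative) pairs where the positive is ranked
    higher, counting ties as half. Fine for cohort-sized evaluations.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    gt = (pos[:, None] > neg[None, :]).mean()
    eq = (pos[:, None] == neg[None, :]).mean()
    return float(gt + 0.5 * eq)
```

Keeping pooling and metrics dependency-free (NumPy only) makes the evaluation easy to rerun on the NLST splits once embeddings are extracted.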
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 38