Multimodal Deep Learning for Phyllodes Tumor Classification from Ultrasound and Clinical Data

Published: 19 Aug 2025, Last Modified: 24 Sept 2025, BSN 2025, CC BY 4.0
Confirmation: I have read and agree with the IEEE BSN 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: phyllodes tumor, breast ultrasound, multimodal AI, feature fusion, class-aware sampling
Abstract: Phyllodes tumors (PTs) are rare fibroepithelial breast lesions that can be malignant but are difficult to classify preoperatively because of their radiological similarity to benign fibroadenomas, which often leads to unnecessary surgical excisions. To address this, we propose a multimodal deep learning framework that integrates breast ultrasound (BUS) images with structured clinical data to improve diagnostic accuracy. We developed a dual-branch neural network that extracts and fuses features from ultrasound images and patient metadata from 81 subjects with confirmed PTs. Class-aware sampling and subject-stratified 5-fold cross-validation were applied to mitigate class imbalance and data leakage. The proposed multimodal method outperforms unimodal baselines in classifying benign versus borderline/malignant PTs. Among the six deep learning image encoders evaluated, ConvNeXt and ResNet18 performed best in the multimodal setting, with AUC-ROC scores of 0.9427 and 0.9349 and F1-scores of 0.6720 and 0.7294, respectively. This study demonstrates the potential of multimodal AI as a non-invasive diagnostic tool, reducing unnecessary surgical excisions and improving clinical decision-making in breast cancer care.
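
The dual-branch fusion design can be illustrated with a minimal sketch, assuming a PyTorch implementation with a ResNet18 backbone (one of the six encoders compared) and a small MLP over the structured clinical features; the clinical-feature count, hidden sizes, and dropout rate are hypothetical choices, not the authors' reported architecture.

import torch
import torch.nn as nn
import torchvision.models as models

class DualBranchPTClassifier(nn.Module):
    # Hypothetical dual-branch model: one branch encodes the BUS image,
    # the other encodes structured clinical metadata; the two feature
    # vectors are fused by concatenation before a shared classifier head.
    def __init__(self, num_clinical_features=10, num_classes=2):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()  # expose the 512-d image embedding
        self.image_branch = backbone
        self.clinical_branch = nn.Sequential(
            nn.Linear(num_clinical_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(512 + 32, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, num_classes),
        )

    def forward(self, image, clinical):
        img_feat = self.image_branch(image)         # (B, 512)
        clin_feat = self.clinical_branch(clinical)  # (B, 32)
        return self.classifier(torch.cat([img_feat, clin_feat], dim=1))

Likewise, the subject-stratified 5-fold split and class-aware sampling could be realized with scikit-learn's StratifiedGroupKFold (keeping all images from one subject in the same fold to prevent leakage) and PyTorch's WeightedRandomSampler with inverse-class-frequency weights; the synthetic subject and label arrays below are placeholders, since per-image counts are not given in the abstract.

import numpy as np
import torch
from sklearn.model_selection import StratifiedGroupKFold
from torch.utils.data import WeightedRandomSampler

rng = np.random.default_rng(0)
n_images = 400                                 # placeholder image count
subjects = rng.integers(0, 81, size=n_images)  # subject ID per image
labels = rng.integers(0, 2, size=n_images)     # 0 = benign, 1 = borderline/malignant

cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in cv.split(np.zeros((n_images, 1)), labels, groups=subjects):
    # Class-aware sampling: weight each training image by the inverse
    # frequency of its class so minority-class images are drawn more often.
    class_counts = np.bincount(labels[train_idx])
    sample_weights = (1.0 / class_counts)[labels[train_idx]]
    sampler = WeightedRandomSampler(
        torch.as_tensor(sample_weights, dtype=torch.double),
        num_samples=len(train_idx),
        replacement=True,
    )
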
Track: 3. Signal processing, machine learning, deep learning, and decision-support algorithms for digital and computational health
Nominate Reviewer: Farhan Fuad Abir (farhan.fuad@ucf.edu)
Submission Number: 86