Advanced deep learning architectures for enhanced mammography classification: a comparative study of CNNs and ViT

Shubhi Sharma, Yeshwant Singh, Tanupriya Choudhury

Published: 2025, Last Modified: 22 Jan 2026. Discov. Artif. Intell. 2025. CC BY-SA 4.0

Abstract: Breast cancer is a leading cause of mortality among women worldwide, and early detection via mammography is critical for improving patient outcomes. In this study, we conduct a comprehensive comparative analysis of 15 state-of-the-art deep learning models, including both general-purpose architectures (e.g., ResNet50, ConvNeXt, ViT) and mammogram-specific architectures (e.g., FCCS-Net, ViT-Mammo, GLAM-Net), for breast cancer classification from mammographic images. Leveraging four publicly available datasets, we evaluate all models using a unified and reproducible pipeline under standardized training protocols. Beyond traditional performance metrics, our multi-dimensional assessment includes interpretability via Grad-CAM and attention maps, calibration reliability, inference time, model complexity, and deployment feasibility. Our findings highlight the superior diagnostic accuracy and visual interpretability of mammogram-specific models, particularly FCCS-Net and ViT-Mammo, with ViT-Mammo achieving an AUC of 0.961. Lightweight architectures such as EfficientNetB0 and DenseNet121 demonstrate strong potential for edge deployment due to their efficiency and competitive accuracy. The study further explores model robustness across datasets, calibration reliability, and real-time deployment constraints, offering actionable insights for clinical integration of AI-driven diagnostic tools. This work provides a valuable benchmarking framework and paves the way for the development of interpretable, efficient, and clinically viable AI systems for breast cancer screening.
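The abstract reports AUC and calibration reliability without saying how they are computed. The following is a minimal sketch, not the authors' pipeline, of two common definitions for a binary benign/malignant classifier: AUC as the fraction of positive/negative pairs ranked correctly, and expected calibration error (ECE) over equal-width confidence bins. The function names, the 0.5 decision threshold, the bin count, and the toy labels and scores are all illustrative assumptions; none of the numbers come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative; ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_calibration_error(labels, probs, n_bins=10):
    """ECE for a binary classifier: the sample-weighted mean of
    |accuracy - mean confidence| over equal-width confidence bins.
    Confidence is the probability assigned to the predicted class."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        pred = 1 if p >= 0.5 else 0          # assumed 0.5 decision threshold
        conf = p if pred == 1 else 1.0 - p   # confidence in the predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    total = len(labels)
    ece = 0.0
    for b in bins:
        if b:
            mean_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(a for _, a in b) / len(b)
            ece += (len(b) / total) * abs(accuracy - mean_conf)
    return ece


# Synthetic placeholder data (1 = malignant, 0 = benign); not from the paper.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.12, 0.43, 0.35, 0.81]

auc = roc_auc(toy_labels, toy_scores)
ece = expected_calibration_error(toy_labels, toy_scores)
print(f"AUC = {auc:.3f}, ECE = {ece:.3f}")
```

The pairwise AUC loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation or an established library routine would be the usual choice on full datasets.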

External IDs: dblp:journals/dai/SharmaSC25