A Novel Approach for Breast Tumor MRI Classification: Vision Transformers and Majority Integration

Published: 01 Jan 2024 · Last Modified: 10 May 2025 · ICC 2024 · CC BY-SA 4.0
Abstract: Breast tumors are among the most common malignant cancers in women, and precise classification of breast tumors is pivotal for clinical treatment. Recently, deep learning, especially the Convolutional Neural Network (CNN), has been widely used to address this problem. However, CNN models capture global image information inadequately. Hence, to further improve accuracy, this study employed the Vision Transformer (ViT) for breast tumor classification based on magnetic resonance imaging (MRI), together with a majority decision fusion strategy to enhance classification performance. First, the lesion region was manually segmented and extracted from the MRI images. Then, three grayscale slices from the same patient were combined into a single RGB image so that it could be fed into the ViT. After training, a voting mechanism was applied: since each patient's MRI consists of a series of slices depicting the three-dimensional structure of the breast, the per-slice predictions were fused by majority vote to refine accuracy. The model achieved an accuracy of 91.89%, surpassing VGG16, ResNet50, and other models, and with voting the accuracy of the ViT reached 98.2%. As the quality of medical data improves and the benefits of large models for image analysis become more evident, more such models are expected to be integrated into the healthcare domain. This study may therefore offer useful insights for researchers aiming to improve performance in medical imaging, particularly breast tumor classification.
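
The abstract outlines two implementation steps: stacking three grayscale MRI slices from the same patient into one RGB image for the ViT, and fusing per-slice predictions for each patient by majority vote. Below is a minimal sketch of these two steps, assuming a NumPy-based pipeline; the function names, the per-slice min-max normalization, and the label encoding are illustrative assumptions and are not taken from the paper.

```python
import numpy as np
from collections import Counter

def stack_slices_to_rgb(slice_a: np.ndarray, slice_b: np.ndarray,
                        slice_c: np.ndarray) -> np.ndarray:
    """Combine three grayscale MRI slices from the same patient into a
    3-channel image matching the RGB input expected by a ViT backbone.
    Each slice is min-max normalized to [0, 255] independently (an
    assumption; the paper does not specify the normalization)."""
    channels = []
    for s in (slice_a, slice_b, slice_c):
        s = s.astype(np.float32)
        s = (s - s.min()) / (s.max() - s.min() + 1e-8) * 255.0
        channels.append(s.astype(np.uint8))
    return np.stack(channels, axis=-1)  # shape: (H, W, 3)

def majority_vote(slice_predictions: list[int]) -> int:
    """Fuse per-slice class predictions for one patient into a single
    patient-level label by simple majority voting."""
    return Counter(slice_predictions).most_common(1)[0][0]

# Hypothetical usage: per_slice_preds would come from running the trained
# ViT on every slice of one patient's MRI series.
per_slice_preds = [1, 1, 0, 1, 1]            # 1 = malignant, 0 = benign (assumed labels)
patient_label = majority_vote(per_slice_preds)  # -> 1
```

In this sketch, the vote simply takes the most frequent per-slice class; tie-breaking behavior and any confidence weighting are not described in the abstract and would need to follow the paper's full method.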