Abstract: Skin cancer is a leading malignant disease with rising incidence rates, underscoring the need for early and accurate diagnosis. This paper introduces a new fusion method for multi-class skin lesion classification that combines the Vision Transformer (ViT) and Vision Permutator (ViP) models. The proposed method leverages the global attention mechanism of ViT and the spatial encoding capabilities of ViP to enhance classification performance. Additionally, various data augmentation techniques, such as random zoom, flip, shift, and range adjustments, are applied to address class imbalance. The proposed method is evaluated and analyzed on the ISIC2019 dataset. The models were trained on ISIC2019 from scratch, without pretraining on a large dataset such as ImageNet. The experimental results demonstrated that the fusion models, particularly Fusion cat and Fusion max, achieved superior performance compared to the individual ViT and ViP models. Specifically, Fusion max attained an accuracy of 80.86%, while Fusion cat reached 77.96% mean recall, 76.81% mean precision, and a 77.38% F1-score. These findings suggest that the proposed models can significantly enhance automated skin lesion classification, contributing to the early diagnosis of skin cancer.
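The abstract's "Fusion cat" and "Fusion max" names suggest feature-level fusion by concatenation and by element-wise maximum, respectively. The sketch below illustrates that idea only; the backbone modules, feature dimension, and the assumption of 8 output classes are placeholders, not the authors' implementation.

```python
# Minimal sketch of feature-level fusion of two backbones (ViT and ViP here are
# stand-ins): "cat" concatenates their feature vectors, "max" takes the
# element-wise maximum before a shared classification head.
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    def __init__(self, vit: nn.Module, vip: nn.Module,
                 feat_dim: int, num_classes: int = 8, mode: str = "cat"):
        super().__init__()
        self.vit = vit          # assumed to output (batch, feat_dim) features
        self.vip = vip          # assumed to output (batch, feat_dim) features
        self.mode = mode
        head_in = 2 * feat_dim if mode == "cat" else feat_dim
        self.head = nn.Linear(head_in, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_vit = self.vit(x)
        f_vip = self.vip(x)
        if self.mode == "cat":                      # "Fusion cat"
            fused = torch.cat([f_vit, f_vip], dim=-1)
        else:                                       # "Fusion max"
            fused = torch.maximum(f_vit, f_vip)
        return self.head(fused)


if __name__ == "__main__":
    # Toy usage with trivial stand-in backbones (flatten + linear) on 224x224 RGB input.
    feat_dim = 256
    make_backbone = lambda: nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, feat_dim))
    model = FusionClassifier(make_backbone(), make_backbone(), feat_dim, mode="max")
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 8])
```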