DOLG-NeXt: Convolutional neural network with deep orthogonal fusion of local and global features for biomedical image segmentation

Published: 01 Jan 2023, Last Modified: 27 Sept 2024, Neurocomputing 2023, CC BY-SA 4.0
Abstract: Biomedical image segmentation (BMIS) is an essential yet challenging task in the visual analysis of biomedical images. Modern deep learning architectures, such as UNet, UNet-based variants, Transformer-based networks, and their combinations, have achieved reasonable success in BMIS. However, they still fall short in extracting fine-grained features, and they struggle when local and global feature representations must be modeled jointly to capture spatial dependencies during decoding, which can lead to redundant feature reuse throughout the architecture. Moreover, Transformer-based models lack inductive bias and are computationally complex, so they can perform unsatisfactorily when biomedical training data are limited. This paper proposes a novel encoder-decoder architecture named DOLG-NeXt, which incorporates three major enhancements over UNet-based variants. First, we integrate squeeze-and-excitation network (SE-Net)-driven ConvNeXt stages as the encoder backbone for effective feature extraction. Second, we employ a deep orthogonal fusion of local and global (DOLG) features module in the decoder to retrieve fine-grained contextual feature representations. Third, we construct an SE-Net-like lightweight attention network alongside the DOLG module to provide refined, target-relevant channel-wise feature maps for decoding. To validate the proposed DOLG-NeXt objectively, we perform extensive quantitative and qualitative analysis on four benchmark datasets spanning different biomedical imaging modalities: colonoscopy, electron microscopy, fluorescence, and retinal fundus imaging. DOLG-NeXt achieves Dice coefficient scores of 95.10% on CVC-ClinicDB, 95.80% on ISBI 2012, 94.77% on the 2018 Data Science Bowl, and 84.88% on the DRIVE dataset. The experimental analysis shows that DOLG-NeXt outperforms several state-of-the-art models on BMIS tasks.
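The two building blocks named in the abstract, SE-style channel attention around the ConvNeXt encoder stages and orthogonal fusion of local and global features in the decoder, follow well-known formulations (SE-Net and DOLG). The sketch below is a hypothetical PyTorch rendering of those two ideas only; the module names (`SEBlock`, `OrthogonalFusion`), the reduction ratio, and the concatenation layout are illustrative assumptions and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention, the kind of recalibration
    the abstract attaches to the ConvNeXt encoder stages (sketch only)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: global average pool -> (B, C)
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-channel gates in [0, 1]
        return x * w                     # recalibrate the feature map


class OrthogonalFusion(nn.Module):
    """Orthogonal fusion of a local feature map with a global descriptor,
    in the spirit of the DOLG module used in the decoder (hypothetical,
    not the authors' code)."""

    def forward(self, local_feat, global_feat):
        # local_feat:  (B, C, H, W) decoder feature map
        # global_feat: (B, C) pooled global descriptor
        b, c, h, w = local_feat.shape
        local_flat = local_feat.view(b, c, -1)                        # (B, C, HW)
        # Projection of each local vector onto the global descriptor:
        # proj = (<f_l, f_g> / ||f_g||^2) * f_g
        dot = torch.bmm(global_feat.unsqueeze(1), local_flat)         # (B, 1, HW)
        g_norm_sq = (global_feat ** 2).sum(1).view(b, 1, 1) + 1e-6
        proj = torch.bmm(global_feat.unsqueeze(2), dot / g_norm_sq)   # (B, C, HW)
        # Orthogonal component: the part of the local features not explained
        # by the global descriptor.
        orth = local_feat - proj.view(b, c, h, w)
        # Concatenate the orthogonal local component with the broadcast
        # global descriptor along the channel axis.
        global_map = global_feat.view(b, c, 1, 1).expand(-1, -1, h, w)
        return torch.cat([orth, global_map], dim=1)                   # (B, 2C, H, W)
```

As a quick usage check, `OrthogonalFusion()(torch.randn(2, 64, 32, 32), torch.randn(2, 64))` returns a tensor of shape `(2, 128, 32, 32)`, i.e., local detail decorrelated from the global context plus the global context itself, which a subsequent decoder block can then reduce back to the working channel width.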
