Abstract: In recent years, deep-learning-based techniques have been successfully applied to medical image segmentation, which plays an important role in intelligent lesion analysis and disease diagnosis. At present, mainstream segmentation models are primarily based on U-Net, which extracts local features through stacked convolutions but lacks global context and multi-scale semantic interaction between the encoder and decoder, leading to sub-optimal segmentation performance. To address these issues, in this work we propose a new medical image segmentation network, namely SACA-UNet, which improves the U-Net model via self-attention and cross atrous spatial pyramid pooling (Cross-ASPP) mechanisms. Specifically, SACA-UNet first utilizes the self-attention mechanism to capture global features, and then devises a Cross-ASPP module to extract and fuse features of varying receptive fields, promoting multi-scale semantic interaction. We evaluate the segmentation performance of the proposed model on four benchmark datasets, ISIC2018, BUSI, CVC-ClinicDB, and COVID-19, in terms of both the Dice coefficient and IoU metrics. Experimental results demonstrate that SACA-UNet markedly outperforms the baseline methods.
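The core idea behind ASPP-style modules is to run the same filter at several dilation (atrous) rates in parallel, so each branch sees a different receptive field, and then fuse the branches. The following is a minimal, illustrative 1-D NumPy sketch of that idea only; it is not the paper's SACA-UNet or Cross-ASPP implementation, and the kernel, dilation rates, and mean-based fusion are simplifying assumptions (real ASPP heads use learned 2-D convolutions and a learned 1x1-conv fusion).

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1-D dilated convolution.

    The effective receptive field is (len(kernel) - 1) * dilation + 1
    input samples, so larger dilations gather wider context without
    adding parameters.
    """
    k = len(kernel)
    pad = (k - 1) * dilation // 2
    xp = np.pad(x.astype(float), pad)
    out = np.zeros(len(x))
    for i in range(len(x)):
        for j in range(k):
            out[i] += kernel[j] * xp[i + j * dilation]
    return out

def aspp_1d(x, rates=(1, 2, 4)):
    """Toy ASPP: apply one 3-tap averaging kernel at several dilation
    rates in parallel, then fuse the branches by averaging (a stand-in
    for the learned 1x1-conv fusion in real ASPP modules)."""
    kernel = np.full(3, 1.0 / 3.0)  # fixed kernel, an assumption
    branches = [dilated_conv1d(x, kernel, r) for r in rates]
    return np.mean(branches, axis=0)
```

On a constant signal every branch reproduces the interior values exactly, while near the borders the zero padding attenuates the output, which is why real implementations pair large dilations with image-level pooling.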