STM-UNet: An Efficient U-shaped Architecture Based on Swin Transformer and Multiscale MLP for Medical Image Segmentation
Abstract: Automated medical image segmentation can help doctors diagnose faster and more accurately. Deep learning-based medical image segmentation models have progressed significantly in recent years. However, existing models do not leverage the Transformer and the MLP effectively and efficiently to improve the U-shaped architecture. In addition, the multiscale features of the MLP have not been fully extracted in the bottleneck of the U-shaped architecture. In this paper, we propose an efficient U-shaped architecture based on Swin Transformer and multiscale MLP, namely STM-UNet. Specifically, Swin Transformer blocks are added to the skip connections of STM-UNet in the form of residual connections, which enhances the modeling of global features and long-range dependencies. Meanwhile, a novel PCAS-MLP module with parallel convolution is designed and placed in the bottleneck of the proposed architecture to improve segmentation performance. Experimental results on ISIC 2016 and ISIC 2018 demonstrate the effectiveness of the proposed method, which outperforms several state-of-the-art methods in terms of IoU and Dice. The method also achieves a better tradeoff between high segmentation accuracy and low model complexity.
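For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how IoU and Dice are commonly computed for a pair of binary segmentation masks. It is not taken from the paper: the function name `iou_and_dice`, the nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def iou_and_dice(pred: Sequence[Sequence[int]],
                 target: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (IoU, Dice) for two binary masks of identical shape.

    Hypothetical helper for illustration; masks are nested sequences
    where any nonzero value marks a foreground (lesion) pixel.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = pred_area = target_area = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            intersection += p and t   # pixel is foreground in both masks
            pred_area += p            # foreground pixels in the prediction
            target_area += t          # foreground pixels in the ground truth

    union = pred_area + target_area - intersection
    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    iou, dice = iou_and_dice(pred, target)
    print(f"IoU = {iou:.4f}, Dice = {dice:.4f}")
```

For a single pair of masks the two metrics are monotonically related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically; they can still differ once averaged over a dataset, which is one reason segmentation papers tend to report both.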