SAC UW-Net: A self-attention-based network for multimodal medical image segmentation

Published: 01 Jan 2024, Last Modified: 13 Nov 2024 · ISBI 2024 · CC BY-SA 4.0
Abstract: The use of multimodal medical imaging techniques has seen tremendous growth in research and clinical practice. Segmentation of these images is very important for diagnosing abnormalities in different organs of the body. Existing multimodal medical image segmentation models are mostly based on the U-Net architecture. These architectures struggle with objects at different scales because they are limited in their ability to handle spatially varying structures, and they are computationally expensive. Recently, self-attention (SA) has provided significant improvements in deep learning networks by allowing models to selectively focus on important parts of the input and to model complex relationships in the data. In this work, we propose a novel segmentation model named SAC UW-Net. It introduces novel self-attention convolutional blocks in the decoder unit for better representation of feature vectors in multimodal medical images. Furthermore, to account for the variations of pixel intensities in different multimodal images, a transient self-attention block is also introduced between the encoder and decoder units. The proposed model is tested on four benchmark datasets for medical image segmentation: LiTS2017, BraTS2020, Data Science Bowl 2018, and ISIC2018. Extensive experiments based on the performance metrics (dice similarity coefficient and mean IoU) as well as the number of floating-point operations (FLOPs) show that SAC UW-Net is a generalized, efficient model which achieves state-of-the-art results.
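The self-attention mechanism the abstract refers to can be illustrated with a minimal sketch of standard scaled dot-product self-attention over a set of feature vectors. This is the generic SA formulation, not the paper's specific SAC or transient blocks; the projection matrices `Wq`, `Wk`, `Wv` and all shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Generic scaled dot-product self-attention (illustrative, not the paper's block).

    X: (n, d) feature vectors (e.g. flattened spatial positions of a feature map).
    Wq, Wk, Wv: (d, dk) learned projection matrices (randomly initialized here).
    Returns: (n, dk) attention-weighted feature vectors.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (n, n) pairwise affinities
    A = softmax(scores, axis=-1)             # each row is a distribution over positions
    return A @ V                             # every position aggregates all others

rng = np.random.default_rng(0)
n, d, dk = 16, 8, 4                          # toy sizes, chosen for illustration
X = rng.standard_normal((n, d))
out = self_attention(X,
                     rng.standard_normal((d, dk)),
                     rng.standard_normal((d, dk)),
                     rng.standard_normal((d, dk)))
print(out.shape)
```

Because every output position is a weighted sum over all input positions, this operation can relate structures at different scales in one step, which is the property the abstract credits for improved feature representation.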
