Abstract: Neural network-based approaches built on the encoder-decoder architecture have taken the lead in medical image segmentation. However, these approaches are still limited to a single neural structure and thus fall short of leveraging the strengths of the three dominant structures (Convolutional Neural Network, Transformer, and Multilayer Perceptron) simultaneously. Furthermore, simple skip connections cannot effectively bridge the semantic gap between the encoder and decoder at the same level. To alleviate the above problems, this paper proposes Mixed-Net, which formulates a strategy to synergize three neural network structures for medical image segmentation. Specifically, our method introduces two components: a Semantic Gap Bridging Module (SGBM) and a Global Information Compensation Decoder (GICD). The convolution-based SGBM replaces the original skip connections, effectively expanding the receptive field and combining shallow and high-level representations. Equally importantly, we present a GICD containing convolution and Transformer layers, which adequately incorporates local refinement features and global representations in the decoding space. We evaluate Mixed-Net on three different medical image segmentation datasets. Our method sets a new state of the art and demonstrates stronger generalization capability.
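The abstract does not specify the internal design of SGBM and GICD, so the following is only a minimal, hypothetical PyTorch sketch of the idea it describes: a convolution-based module that replaces a plain skip connection (SGBM) and a decoder block that pairs local convolution with self-attention for global compensation (GICD). All layer choices (dilation rates, head counts, normalization) are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SGBM(nn.Module):
    """Hypothetical Semantic Gap Bridging Module: replaces a plain skip
    connection with parallel standard and dilated convolutions to enlarge
    the receptive field and fuse shallow and high-level representations."""
    def __init__(self, channels):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.context = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, skip):
        return self.fuse(torch.cat([self.local(skip), self.context(skip)], dim=1))

class GICDBlock(nn.Module):
    """Hypothetical Global Information Compensation Decoder block: a local
    convolutional refinement followed by self-attention over flattened
    tokens, so decoded features carry both local detail and global context."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):
        x = self.conv(x)                       # local refinement
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attn_out)  # global compensation
        return tokens.transpose(1, 2).reshape(b, c, h, w)

# Example: bridge a skip feature, then decode it with global compensation.
skip = torch.randn(1, 64, 32, 32)
out = GICDBlock(64)(SGBM(64)(skip))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```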