Abstract: With the wide adoption of the Mamba framework, state-space models have achieved outstanding results in computer vision. Nevertheless, their advantage over CNN-based and Transformer-based counterparts remains limited, because they struggle with local region feature extraction owing to deficient positional awareness and a disproportionate emphasis on posterior tokens in their pre-defined scanning schedules. Moreover, most current Mamba-based models fail to exploit the benefits of integrating multiple visual encoding strategies. Against this background, we propose DUMFNet, a novel DoubleU-Net framework with multiple visual encoding strategies and a local-based scanning mechanism. Comparative and ablation experiments against current SOTA methods verify the superiority or competitiveness of DUMFNet. For reproduction, the implementation code is available at https://github.com/Panpps202006/DUMFNet.
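To illustrate the idea behind a local-based scanning mechanism (the details of the paper's actual scheme are in its method section; this is only a minimal sketch with a hypothetical helper name), the code below reorders flattened image-token indices so that tokens within each local window are visited consecutively, rather than in a global raster scan that separates vertical neighbours.

```python
def local_scan_order(h, w, win):
    """Return flattened token indices visited window-by-window.

    Tokens inside each win x win window are read in raster order
    before the scan moves to the next window, so spatially adjacent
    tokens stay close in the 1-D sequence fed to the state-space model.
    """
    order = []
    for wy in range(0, h, win):            # iterate over window rows
        for wx in range(0, w, win):        # iterate over window columns
            for y in range(wy, min(wy + win, h)):
                for x in range(wx, min(wx + win, w)):
                    order.append(y * w + x)
    return order

# On a 4x4 token grid with 2x2 windows, the first window groups
# tokens 0, 1 (row 0) with 4, 5 (row 1) before moving right.
print(local_scan_order(4, 4, 2)[:4])  # → [0, 1, 4, 5]
```

A global raster scan would instead visit token 4 only after the entire first row, which is one source of the weak locality the abstract attributes to pre-defined scanning schedules.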