Vision State Space Duality for Medical Image Segmentation: Enhancing Precision through Non-Causal Modeling

26 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: State Space Models, Vision State Space Duality, Medical Image Segmentation, self-attention, skip connections
TL;DR: We introduce VSSD-UNet, a novel medical image segmentation model that integrates Vision State Space Duality (VSSD) within a UNet-like architecture and demonstrates superior performance over traditional UNet variants.
Abstract: In medical image analysis, Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have set significant benchmarks. However, CNNs are limited in long-range modeling, whereas Transformers are hampered by their quadratic computational complexity. Recently, State Space Models (SSMs) have gained prominence in vision tasks because they offer linear computational complexity. State Space Duality (SSD), an improved variant of SSMs, was introduced in Mamba2 to enhance model performance and efficiency. Inspired by this, we tailor the Vision State Space Duality (VSSD) model to medical image segmentation by integrating it within a UNet-like architecture, which is renowned for its effectiveness in the field. The resulting model, named VSSD-UNet, employs skip connections to preserve spatial information and a series of VSSD blocks for feature extraction. In addition, its decoder combines VSSD with self-attention in a hybrid structure, ensuring that both local details and global contexts are captured. Finally, we conducted comparative and ablation experiments on two public lesion segmentation datasets, ISIC2017 and ISIC2018. The results show that VSSD-UNet outperforms several UNet variants in medical image segmentation under the same hyper-parameter settings. Our code will be released soon.
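
Since the code has not yet been released, the following is a minimal, illustrative PyTorch sketch of the macro-structure the abstract describes: a UNet-like encoder of VSSD blocks with skip connections, and a hybrid VSSD + self-attention decoder. All class names (SimplifiedVSSDBlock, HybridDecoderBlock, VSSDUNetSketch), the channel dimensions, and especially the depthwise-convolution stand-in for the true VSSD (non-causal SSD) token mixer are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn


class SimplifiedVSSDBlock(nn.Module):
    """Stand-in for a VSSD block: a depthwise-conv token mixer plus an MLP.
    The real block uses non-causal state space duality (Mamba2-style);
    this placeholder only mimics the block's interface and residual layout."""

    def __init__(self, dim):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.mixer = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = self.norm1(x.flatten(2).transpose(1, 2))       # (B, HW, C)
        x = x + self.mixer(t.transpose(1, 2).reshape(b, c, h, w))
        t = self.mlp(self.norm2(x.flatten(2).transpose(1, 2)))
        return x + t.transpose(1, 2).reshape(b, c, h, w)


class HybridDecoderBlock(nn.Module):
    """Decoder stage: upsample, fuse the encoder skip connection, then a
    VSSD block followed by multi-head self-attention (the hybrid design)."""

    def __init__(self, in_dim, out_dim, heads=4):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_dim, out_dim, 2, stride=2)
        self.fuse = nn.Conv2d(2 * out_dim, out_dim, 1)
        self.vssd = SimplifiedVSSDBlock(out_dim)
        self.norm = nn.LayerNorm(out_dim)
        self.attn = nn.MultiheadAttention(out_dim, heads, batch_first=True)

    def forward(self, x, skip):
        x = self.up(x)
        x = self.fuse(torch.cat([x, skip], dim=1))         # skip connection
        x = self.vssd(x)                                   # local details
        b, c, h, w = x.shape
        t = self.norm(x.flatten(2).transpose(1, 2))
        t, _ = self.attn(t, t, t)                          # global context
        return x + t.transpose(1, 2).reshape(b, c, h, w)


class VSSDUNetSketch(nn.Module):
    """UNet-like macro-structure: VSSD encoder stages with downsampling,
    hybrid VSSD + self-attention decoder stages with skip connections."""

    def __init__(self, in_ch=3, num_classes=1, dims=(32, 64, 128)):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, dims[0], 3, padding=1)
        self.enc = nn.ModuleList(SimplifiedVSSDBlock(d) for d in dims)
        self.down = nn.ModuleList(
            nn.Conv2d(dims[i], dims[i + 1], 2, stride=2)
            for i in range(len(dims) - 1)
        )
        self.dec = nn.ModuleList(
            HybridDecoderBlock(dims[i + 1], dims[i])
            for i in reversed(range(len(dims) - 1))
        )
        self.head = nn.Conv2d(dims[0], num_classes, 1)

    def forward(self, x):
        x, skips = self.stem(x), []
        for i, blk in enumerate(self.enc):
            x = blk(x)
            if i < len(self.down):
                skips.append(x)                            # save for decoder
                x = self.down[i](x)
        for blk in self.dec:
            x = blk(x, skips.pop())
        return self.head(x)                                # per-pixel logits


if __name__ == "__main__":
    model = VSSDUNetSketch()
    print(model(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])

The sketch preserves the two design points the abstract emphasizes: encoder features are carried to the decoder via skip connections to preserve spatial information, and each decoder stage applies a VSSD-style mixer before self-attention so that local details and global contexts are modeled in sequence.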
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7023