MEDSPIKEFORMER: All Neurons Matter for Medical Image Segmentation

14 Sept 2025 (modified: 19 Nov 2025)ICLR 2026 Conference Withdrawn SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Medical image segmentation, Spiking self-attention, Self-attention, Spike neural network
Abstract: Spiking self-attention (SSA) has emerged as a promising approach for medical image segmentation due to its event-driven and energy-efficient nature. However, segmentation performance still degrades in complex scenarios where salient and non-salient regions coexist. Two fundamental issues remain: i) existing SSA mechanisms rely only on activated neurons, overlooking the contextual cues carried by inactivated neurons, and ii) the binary spike representation causes distribution distortions that make spiking self-attention lag behind their ANN-based self-attention (SA) in spatial discriminability. To overcome these challenges, we propose MedSpikeFormer, a spiking transformer built on the principle that all neurons matter, both activated and inactivated. MedSpikeFormer introduces a Spike-based Decomposed Self-Attention (SDSA) that explicitly models four types of neuronal interactions: activated–activated, activated–inactivated, inactivated–activated, and inactivated–inactivated, thus recovering rich contextual dependencies ignored by conventional SSA. Furthermore, we employ a distribution alignment loss that minimizes the divergence between SDSA and ANN-based self-attention (SA), significantly closing the performance gap to improve spatial feature discriminability while maintaining the binary nature of spiking neural networks. Extensive experiments on five medical segmentation benchmarks demonstrate that MedSpikeFormer consistently outperforms 14 state-of-the-art methods, achieving up to +2.4% mIoU on ISIC2018 and +8.7% on COVID-19. These results confirm that leveraging both fired and non-fired neurons is crucial for robust spike-driven medical image segmentation. Code is available at https://github.com/AnonymousPaper2026/MedSpikeFormer.
Supplementary Material: pdf
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5036
Loading