XLSTM-HVED: Cross-Modal Brain Tumor Segmentation and MRI Reconstruction Method Using Vision XLSTM and Heteromodal Variational Encoder-Decoder
Abstract: Neurogliomas are among the most aggressive forms of cancer, presenting considerable challenges in both treatment and monitoring due to their unpredictable biological behavior. Magnetic resonance imaging (MRI) is currently the preferred method for diagnosing and monitoring gliomas. However, the lack of specific imaging techniques often compromises the accuracy of tumor segmentation during the imaging process. To address this issue, we introduce the XLSTM-HVED model. This model integrates a heteromodal encoder-decoder framework with the Vision XLSTM module to reconstruct missing MRI modalities. By deeply fusing spatial and temporal features, it enhances tumor segmentation performance. The key innovation of our approach is the Self-Attention Variational Encoder (SAVE) module, which improves the integration of modal features. Additionally, it optimizes the interaction of features between segmentation and reconstruction tasks through the Squeeze-Fusion-Excitation Cross Awareness (SFECA) module. Our experiments using the BraTS 2024 dataset demonstrate that our model significantly outperforms existing advanced methods in handling cases where modalities are missing. Our source code is available at https://github.com/Quanat0607/XLSTM-HVED.