Graph-Based Multi-Modal Light-weight Network for Adaptive Edge MRI Tumor Segmentation

ICLR 2026 Conference Submission17486 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Brain Tumor Segmentation, Lightweight Network, Multi-Modal Interaction

TL;DR: We present EdgeIMLocSys and GMLN-BTS. GMLN-BTS is a lightweight graph-based multi-modal 3D segmentation model (4.58M params) with a modality-aware encoder, graph cross-modal interaction, and a voxel refinement upsampling module.

Abstract: Brain tumor segmentation plays a critical role in clinical diagnosis and treatment planning, yet variability in imaging quality across MRI scanners poses a significant challenge to model generalization. To address this, we propose the Edge Iterative MRI Lesion Localization System (EdgeIMLocSys), which integrates Continuous Learning from Human Feedback to adaptively fine-tune segmentation models based on clinician feedback, thereby enhancing robustness to scanner-specific imaging characteristics. Central to this system is the Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS), which employs a Modality-Aware Adaptive Encoder (M2AE) to extract multi-scale semantic features efficiently and a Graph-based Multi-Modal Collaborative Interaction Module (G2MCIM) to model complementary cross-modal relationships via graph structures. Additionally, we introduce a Voxel Refinement UpSampling Module (VRUM) that combines linear interpolation and multi-scale transposed convolutions to suppress artifacts while preserving high-frequency details, improving segmentation boundary accuracy. With only 4.58 million parameters, a 98% reduction compared to mainstream 3D Transformer models, GMLN-BTS achieves state-of-the-art (SOTA) performance among lightweight models on both the BraTS2017 and BraTS2021 datasets and significantly outperforms existing lightweight approaches. This work demonstrates that high-accuracy, resource-efficient brain tumor segmentation is achievable for deployment in resource-constrained clinical environments.
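The upsampling idea described for VRUM (a smooth interpolation branch combined with transposed convolutions of several kernel sizes) can be illustrated with a minimal 1D sketch in pure Python. This is not the authors' implementation: the abstract does not say how the branches are fused, so the averaging rule, the center cropping, the fixed kernels, and every function name below are assumptions chosen only for illustration. The actual module operates on 3D feature volumes with learned weights.

```python
def linear_upsample(x, scale=2):
    """Upsample a 1D signal by linear interpolation, keeping both endpoints."""
    n, m = len(x), len(x) * scale
    if n == 1:
        return [float(x[0])] * m
    out = []
    for i in range(m):
        pos = i * (n - 1) / (m - 1)      # output index mapped to input coordinate
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(x[lo] * (1 - frac) + x[hi] * frac)
    return out


def transposed_conv1d(x, kernel, stride=2):
    """Transposed convolution: each input sample scatters a scaled kernel copy."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for k, w in enumerate(kernel):
            out[i * stride + k] += v * w
    return out


def center_crop(x, length):
    """Crop a list to `length` samples around its center."""
    start = (len(x) - length) // 2
    return x[start:start + length]


def refine_upsample(x, kernels, scale=2):
    """Hypothetical fusion: average the interpolation branch with the mean of
    several transposed-convolution branches (one per kernel size).
    Assumes every kernel has at least `scale` taps so cropping is valid."""
    smooth = linear_upsample(x, scale)
    target = len(smooth)
    branches = [center_crop(transposed_conv1d(x, k, scale), target) for k in kernels]
    detail = [sum(vals) / len(branches) for vals in zip(*branches)]
    return [0.5 * (s + d) for s, d in zip(smooth, detail)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, 1.0]
    # Two fixed kernels of different sizes stand in for learned multi-scale filters.
    print(refine_upsample(signal, kernels=[[1.0, 1.0], [0.25, 0.75, 0.75, 0.25]]))
```

In this sketch the interpolation branch supplies a smooth, artifact-free baseline, while the transposed-convolution branches can carry sharper detail; a learned model would replace the fixed kernels and the simple average with trained parameters.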

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 17486