Keywords: MAHA, RadBot, Segmentation, Brain Tumor
Abstract: Brain tumor segmentation and survivability prediction are crucial in neuro-oncology, directly impacting clinical decision-making and patient management. However, traditional deep learning-based segmentation approaches often lack flexibility, interpretability, and adaptability to user-driven corrections, limiting their clinical utility. To overcome these challenges, we introduce RadBot, a novel vision-language model (VLM)-powered framework that integrates tumor segmentation, interpretative analysis, and survivability prediction into a unified, interactive pipeline. Moreover, prompt-based VLMs, such as CLIPSeg, are sensitive to linguistic variations in English prompts, which often fail to span the embedding space needed to capture nuanced tumor morphologies. To mitigate this prompt sensitivity without retraining, we introduce model-agnostic hybrid augmentation (MAHA), an inference-time prompt ensemble method for brain tumor analysis. To improve interpretability, we incorporate LLaVA, a multimodal large language model (MMLLM), enabling interactive question answering for tumor analysis. Additionally, the RadBot Mask Editor provides an interactive refinement tool that allows radiologists to manually correct segmentation errors through brushing and unbrushing tools, ensuring clinically precise results. For survivability prediction, RadBot integrates LLaVA-based analysis of MRI and clinical data for efficient prognosis estimation and decision support. We validate the proposed RadBot+MAHA framework on the BraTS 2020 and BraTS 2021 datasets, achieving state-of-the-art (SOTA) segmentation performance. Our findings demonstrate that integrating VLMs and MMLLMs enhances segmentation accuracy, interpretability, and clinical relevance. RadBot bridges the gap between automated segmentation and expert-driven analysis, establishing a new paradigm for AI-assisted workflows.
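The abstract does not spell out MAHA's aggregation rule, so the following is only a minimal sketch of what an inference-time prompt ensemble over CLIPSeg could look like, assuming mean fusion of per-prompt probabilities. The paraphrased prompt list, the input file path, and the 0.5 threshold are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of an inference-time prompt ensemble in the spirit of
# MAHA; the fusion rule (mean of sigmoids) is an assumption for illustration.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")
model.eval()

# Paraphrased prompts for the same target, meant to cover linguistic
# variation (illustrative; not the paper's actual prompt set).
prompts = [
    "a brain tumor",
    "the tumor region in a brain MRI",
    "an abnormal mass in brain tissue",
    "a glioma lesion",
]

image = Image.open("brain_mri_slice.png").convert("RGB")  # hypothetical path

# One forward pass per prompt; the image is tiled across all prompts.
inputs = processor(
    text=prompts, images=[image] * len(prompts),
    padding=True, return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits  # (num_prompts, H, W) per-prompt masks

# Ensemble by averaging per-prompt probabilities, then threshold.
probs = torch.sigmoid(logits)         # (num_prompts, H, W)
ensemble_prob = probs.mean(dim=0)     # fuse across prompts
mask = (ensemble_prob > 0.5).float()  # binary tumor mask
```

Because the ensemble operates purely at inference, the CLIPSeg weights are left untouched, which is consistent with the abstract's "without retraining" claim.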
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20482