FlowRefineSeg: Lightweight Segmentation of Holistic Surgical Scenes with Spatial and Temporal Refinement
Keywords: Lightweight Segmentation Model, Holistic Surgical Scene Segmentation, Pyramidal Convolution, Lightweight Attention
TL;DR: This paper proposes a lightweight framework for holistic surgical scene segmentation that incorporates temporal consistency to deliver computationally efficient, high-precision surgical video segmentation.
Abstract: Holistic surgical video segmentation is crucial for real-time applications such as proximity analysis of surgical components. For effective integration into clinical workflows, segmentation models must deliver accurate and consistent outputs while remaining computationally efficient. However, current state-of-the-art (SOTA) architectures are complex, while lighter models fall short of baseline performance. Additionally, temporal consistency is often overlooked in existing surgical segmentation frameworks.
To address these limitations, this work introduces FlowRefineSeg, a lightweight segmentation model that achieves SOTA performance at low computational cost. It features a Linear Self-Attention module for effective low-level feature processing, a Gaussian Refinement block to enhance spatial coherence, and a Temporal Refinery module to ensure consistency across video frames.
Our experiments show that FlowRefineSeg sets a new benchmark on EndoVis18 (74% mIoU, 78% Dice) and reaches SOTA performance on CholecSeg8k (75% mIoU, 80% Dice) with fewer than 25M parameters, establishing a new standard for lightweight holistic surgical segmentation.
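For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean IoU and mean Dice are commonly computed from a predicted and a ground-truth label map. It is an illustration only: the function names, the choice to skip classes absent from both maps, and the per-class averaging are assumptions, and the abstract does not state the exact evaluation protocol used for EndoVis18 or CholecSeg8k.

```python
from typing import List, Sequence, Tuple


def per_class_counts(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> Tuple[List[int], List[int], List[int]]:
    """Count per-class intersection, predicted area, and ground-truth area.

    `pred` and `target` are flattened label maps (one class index per pixel).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    inter = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            inter[p] += 1
    return inter, pred_area, target_area


def mean_iou_and_dice(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> Tuple[float, float]:
    """Return (mIoU, mean Dice), averaged over classes present in either map.

    Assumption: classes absent from both prediction and ground truth are
    skipped rather than counted as a perfect or zero score.
    """
    inter, pred_area, target_area = per_class_counts(pred, target, num_classes)
    ious, dices = [], []
    for c in range(num_classes):
        total = pred_area[c] + target_area[c]
        if total == 0:
            continue  # class c appears in neither map
        union = total - inter[c]
        ious.append(inter[c] / union)       # IoU  = |A ∩ B| / |A ∪ B|
        dices.append(2 * inter[c] / total)  # Dice = 2|A ∩ B| / (|A| + |B|)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious), sum(dices) / len(dices)


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou, mdice = mean_iou_and_dice(pred, target, num_classes=3)
```

For any single class, Dice equals 2·IoU / (1 + IoU), so Dice is never lower than IoU; this is consistent with the Dice figures above being higher than the corresponding mIoU figures.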
Primary Subject Area: Segmentation
Secondary Subject Area: Application: Endoscopy
Registration Requirement: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 147