HybridVPS: Hybrid-Supervised Video Polyp Segmentation Under Low-Cost Labels

Published: 01 Jan 2024, Last Modified: 26 Oct 2024. Venue: IEEE Signal Process. Lett. 2024. License: CC BY-SA 4.0
Abstract: Deep polyp segmentation methods have shown remarkable potential in boosting diagnostic efficiency. Nevertheless, these methods rely on sufficient pixel-wise annotated data, which is time-consuming and labor-intensive to acquire in clinical practice. This challenge is further exacerbated in video polyp segmentation because of the massive number of video frames. To alleviate the annotation burden, in this letter we propose a label-efficient polyp segmentation framework named HybridVPS, which drastically reduces the annotation cost while maintaining satisfactory performance. Our core insight is to take full advantage of the similar semantics between consecutive video frames. Specifically, only a few frames require pixel-wise annotations, while cheap scribble annotations suffice for the remaining frames. To fully leverage the coarse location information provided by scribble annotations, we introduce an adaptive label prompter, which uses pixel-wise annotations to provide reliable guidance for scribble-annotated neighboring frames, thereby improving overall segmentation accuracy. Extensive experiments on the large-scale video polyp dataset SUN-SEG demonstrate the superiority of our approach. HybridVPS achieves performance comparable to the fully supervised scheme while requiring only 2% of the pixel-level annotations.
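The hybrid labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the evenly spaced choice of densely labeled frames, the function names `assign_label_types` and `partial_bce`, and the use of a partial binary cross-entropy on scribbled pixels (a common loss for scribble supervision) are all assumptions made for illustration. The adaptive label prompter itself is not modeled here.

```python
import math

def assign_label_types(num_frames, dense_ratio=0.02):
    """Hypothetical helper: mark evenly spaced frames as 'pixel' (dense mask);
    all remaining frames receive only a 'scribble' annotation."""
    if not 0 < dense_ratio <= 1:
        raise ValueError("dense_ratio must be in (0, 1]")
    stride = max(1, round(1 / dense_ratio))
    return ["pixel" if i % stride == 0 else "scribble" for i in range(num_frames)]

def partial_bce(probs, scribble, ignore=-1, eps=1e-7):
    """Binary cross-entropy averaged over scribbled pixels only (assumed loss).

    probs:    flat list of predicted foreground probabilities.
    scribble: flat list with 1 (polyp), 0 (background), or `ignore` (unlabeled).
    """
    total, count = 0.0, 0
    for p, y in zip(probs, scribble):
        if y == ignore:
            continue  # unlabeled pixels contribute nothing to the loss
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        count += 1
    return total / count if count else 0.0

# A 100-frame clip with a 2% dense-label budget, and a 4-pixel toy prediction
# where only the first and last pixels carry scribble labels.
labels = assign_label_types(100, dense_ratio=0.02)
loss = partial_bce([0.9, 0.2, 0.5, 0.1], [1, -1, -1, 0])
```

Under these assumptions, a 2% budget means roughly one pixel-wise mask per 50 frames, and the scribble loss ignores every pixel the annotator did not touch.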