Interactive Object Segmentation with Inside-Outside Guidance

Published: 05 Dec 2022 · Last Modified: 14 Nov 2025 · OpenReview Archive Direct Upload · CC BY 4.0
Abstract: This work explores how to harvest precise object segmentation masks while minimizing the human interaction cost. To this end, we propose a simple yet effective interaction scheme, named Inside-Outside Guidance (IOG). Concretely, we leverage an inside point clicked near the object center and two outside points at symmetrical corner locations (top-left and bottom-right, or top-right and bottom-left) of an almost-tight bounding box that encloses the target object. This interaction yields a total of one foreground click and four background clicks for segmentation. The advantages of our IOG are four-fold: 1) the two outside points help remove distractions from other objects or the background; 2) the inside point helps eliminate unrelated regions inside the bounding box; 3) the inside and outside points are easy to identify, avoiding the ambiguity that the state-of-the-art DEXTR [1] faces when labeling extreme points on some samples; 4) it naturally supports additional click annotations for further correction. Despite its simplicity, our IOG not only achieves state-of-the-art performance on several popular benchmarks such as GrabCut [2], PASCAL [3] and MS COCO [4], but also demonstrates strong generalization capability across different domains such as street scenes (Cityscapes [5]), aerial imagery (Rooftop [6] and Agriculture-Vision [7]) and medical images (ssTEM [8]). Code is available at https://github.com/shiyinzhang/Inside-Outside-Guidance.
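The click arithmetic behind the scheme can be sketched as follows: the two clicked outside corners define an axis-aligned bounding box whose remaining two corners can be inferred, which is how two outside clicks become four background clicks. The snippet below is a minimal illustration of this derivation; the function name `iog_clicks` is illustrative and not taken from the paper's released code.

```python
def iog_clicks(inside_pt, corner_a, corner_b):
    """Derive the one foreground and four background clicks used by IOG.

    inside_pt: (x, y) click near the object center (the foreground click).
    corner_a, corner_b: two clicks at symmetrical opposite corners
    (e.g. top-left and bottom-right) of an almost-tight bounding box.
    """
    xs = sorted([corner_a[0], corner_b[0]])
    ys = sorted([corner_a[1], corner_b[1]])
    # The two clicked corners determine the box, which implies the
    # other two corners -- four background clicks in total.
    background = [(xs[0], ys[0]), (xs[1], ys[0]),
                  (xs[0], ys[1]), (xs[1], ys[1])]
    return [inside_pt], background


fg, bg = iog_clicks(inside_pt=(50, 40), corner_a=(10, 5), corner_b=(90, 80))
# fg is the single foreground click; bg holds the four corners of the
# 10..90 x 5..80 box.
```

In practice such clicks are typically encoded as guidance maps (e.g. Gaussian heatmaps centered on each click) and concatenated with the RGB input, as in DEXTR-style pipelines; the exact encoding is described in the paper rather than in the abstract.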