Abstract: Many two-stage instance segmentation heads predict a coarse 28x28 mask per instance, which is insufficient to capture the fine-grained details of many objects. To address this issue, PointRend and RefineMask predict a 112x112 segmentation mask resulting in higher quality segmentations. However, both methods have limitations by either not having access to neighboring features (PointRend) or by performing computation at all spatial locations instead of sparsely (RefineMask). In this work, we propose EffSeg performing fine-grained instance segmentation in an efficient way by using our Structure-Preserving Sparsity (SPS) method based on separately storing the active features, the passive features, and a dense 2D index map containing the feature indices. The goal of the index map is to preserve the 2D spatial configuration or structure between the features such that any 2D operation can still be performed. EffSeg achieves similar performance on COCO compared to RefineMask, while reducing the number of FLOPs by 71% and increasing the FPS by 29%. Code will be released.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~David_Fouhey2
Submission Number: 1812
Loading