Differentiable Polygon Modeling for Object Instance Segmentation

Thomas Paniagua; Ryan Grainger; Tianfu Wu

26 Sept 2024 (modified: 25 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0

Keywords: Differentiable Polygon Modeling, Object Instance Segmentation, PolygonAlign

TL;DR: A simple yet effective differentiable polygon model with state-of-the-art polygonal segmentation performance on MS-COCO instance segmentation

Abstract: Differentiable polygon (boundary-/contour-based) modeling for object instance segmentation remains an open problem in computer vision and deep learning. It has also been under-explored in the deep learning era compared with its counterpart, bit-mask (region-based) modeling. In this paper, we present a method for differentiable polygon-based instance segmentation. As is common in the prior art, we assume a fixed topology, i.e., the number of vertices, $K$, is predefined and fixed (e.g., $K=250$) in learning and inference. We address two modeling problems: i) The alignment between a predicted $K$-vertex polygon and a target ground-truth $L$-vertex polygon in learning, where $L$ varies significantly. We present PolygonAlign, similar in spirit to the RoIAlign used in bit-mask-based instance segmentation, which enables using a simple $\ell_2$ norm as the vertex prediction loss function in learning. ii) The parameterization of a $K$-vertex polygon. We present a variant of the active contour model, which consists of a learnable contour initialization module and a one-step vertex-aware refinement/updating module. The initialization is learned via an affine-transformation-decoupled vertex regression method: a polygon is parameterized by a translation vector, a rotation transformation matrix, and the vertex displacement vectors. In experiments, the proposed method is tested on the MS-COCO 2017 benchmark using the Sparse R-CNN framework. It obtains state-of-the-art performance compared with prior polygon modeling methods. We also show that the empirical upper-bound performance of the proposed method is much higher than that of all existing instance segmentation methods, which encourages further research on differentiable polygon modeling.
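To make the two modeling problems concrete, the following is a minimal illustrative sketch, not the authors' PolygonAlign or their refinement module, whose details the abstract does not give. It assumes that a ground-truth $L$-vertex polygon can be brought into correspondence with $K$ predicted vertices by uniform arc-length resampling (a simple stand-in technique), and that a polygon is composed from a translation vector, a 2D rotation built from an angle, and per-vertex displacement vectors. The function names `resample_polygon`, `compose_polygon`, and `l2_vertex_loss` are hypothetical.

```python
import numpy as np

def resample_polygon(vertices, k):
    """Resample a closed (L, 2) polygon to k points evenly spaced by arc length.

    Assumption: a stand-in for aligning an L-vertex target with K predicted
    vertices; it does not reproduce PolygonAlign.
    """
    pts = np.asarray(vertices, dtype=float)
    closed = np.vstack([pts, pts[:1]])                # repeat first vertex to close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])     # arc length at each vertex
    targets = np.linspace(0.0, cum[-1], k, endpoint=False)
    x = np.interp(targets, cum, closed[:, 0])
    y = np.interp(targets, cum, closed[:, 1])
    return np.stack([x, y], axis=1)

def compose_polygon(translation, theta, displacements):
    """Build K vertices as rotation(theta) applied to displacements, plus translation.

    Assumption: the rotation matrix is parameterized by a single angle theta.
    """
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(displacements, dtype=float) @ rot.T + np.asarray(translation, dtype=float)

def l2_vertex_loss(pred, target):
    """Mean squared Euclidean distance between corresponding vertices."""
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

# Usage: an L=4 square resampled to K=8 vertices, compared with a shifted prediction.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
target = resample_polygon(square, 8)
pred = compose_polygon([0.1, 0.0], 0.0, target)
loss = l2_vertex_loss(pred, target)
```

Once both polygons have the same number of ordered vertices, the loss reduces to a per-vertex squared distance. The sketch ignores the choice of starting vertex and traversal direction, which any real alignment scheme would also have to resolve.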

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 8152