Polygon Pixel IoU: Similarity Metric between Polygons with Different Number of Vertices for Arbitrary-Shaped Text Spotting

Published: 2025, Last Modified: 13 Mar 2026 | ICASSP 2025 | CC BY-SA 4.0
Abstract: Existing arbitrary-shaped text (ArT) spotting models have used coordinate-based error (CBE) loss to optimize polygon region detection. Optimization using CBE loss requires a fine-grained ground truth (FineGT), which is a fixed-length list of polygon vertices. However, not all datasets have FineGT, and creating FineGT is costly. In this paper, we propose Polygon Pixel Intersection over Union (PPIoU), which shows the similarity of polygon regions and can be used for neural network training without FineGT. Unlike CBE loss, PPIoU loss represents the region-based error for polygons and is not affected by the number of polygon vertices. PPIoU loss allows the use of multiple datasets containing polygons with different numbers of vertices, making it easy to increase the amount of training data. This is a useful feature since more training data generally improves accuracy. We show that the PPIoU loss can easily improve the accuracy of existing text spotting models.
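To illustrate why a region-based similarity does not depend on the number of polygon vertices, here is a minimal sketch of a pixel-level IoU between two polygons. It is not the paper's PPIoU formulation, which the abstract does not specify and which must be usable as a training loss. This version is a plain, non-differentiable rasterized IoU. The helper names `rasterize` and `pixel_iou`, the even-odd fill rule, and the pixel-centre sampling are all assumptions made for illustration.

```python
import numpy as np


def rasterize(polygon, height, width):
    """Return a boolean mask of pixel centres inside `polygon` (even-odd rule).

    `polygon` is a sequence of (x, y) vertices of any length >= 3.
    """
    pts = np.asarray(polygon, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    px = xs + 0.5  # sample each pixel at its centre
    py = ys + 0.5
    inside = np.zeros((height, width), dtype=bool)

    x0, y0 = pts[-1]
    for x1, y1 in pts:
        if y0 != y1:  # horizontal edges never cross a horizontal ray
            # Does the edge straddle the horizontal line through the pixel centre?
            crosses = (y0 > py) != (y1 > py)
            # x coordinate where the edge meets that horizontal line
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            # Toggle pixels whose rightward ray crosses this edge
            inside ^= crosses & (px < x_at)
        x0, y0 = x1, y1
    return inside


def pixel_iou(poly_a, poly_b, height, width):
    """Intersection over union of the two rasterized polygon regions."""
    mask_a = rasterize(poly_a, height, width)
    mask_b = rasterize(poly_b, height, width)
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)


# The same square described with 4 vertices and with 8 vertices
square_4 = [(2, 2), (10, 2), (10, 10), (2, 10)]
square_8 = [(2, 2), (6, 2), (10, 2), (10, 6), (10, 10), (6, 10), (2, 10), (2, 6)]
# A square of the same size shifted 4 pixels to the right
shifted = [(6, 2), (14, 2), (14, 10), (6, 10)]

same_region_iou = pixel_iou(square_4, square_8, 16, 16)
shifted_iou = pixel_iou(square_4, shifted, 16, 16)
```

The 4-vertex and 8-vertex squares cover the same pixels, so their region-based similarity is identical even though a coordinate-based comparison would need the two vertex lists to have the same length and ordering.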