TPD-STR: Text Polygon Detection with Split Transformers

Published: 27 Feb 2025, Last Modified: 18 Oct 2025 | WACV 2025 | Everyone | CC BY-NC 4.0
Abstract: Regressing text in natural scenes with polygonal representations is challenging due to shape prediction difficulties. To address this, we introduce Text Polygon Detection with Split Transformers (TPD-STR), which directly regresses polygonal points. TPD-STR incorporates the Decoder Split (DS) architecture to separate polygonal point regression and textness classification, and the Positional Information Propagation (PIP) module to enhance classification. Both modules are effective and compatible with existing methods. TPD-STR achieves state-of-the-art (SOTA) performance among regression-based methods, surpassing segmentation-based methods on MSRA-TD500 without external data. Adding DS and PIP to existing models further improves performance. Experiments demonstrate the model's ability to detect text instances effectively.
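
The separation that the abstract attributes to the Decoder Split can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that each branch receives its own query feature vector and that each branch ends in a single dense layer. The names `SplitHeads`, `reg_query`, and `cls_query`, the feature size of 8, and the four-point polygon are all hypothetical. The PIP module is omitted because the abstract does not describe its mechanics.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_out, n_in, rng):
    """Hypothetical random initialisation; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class SplitHeads:
    """Toy stand-in for two separate branches (an assumption, not the paper's code).

    One branch regresses polygon points, the other scores textness.
    The branches share no parameters and read different input features.
    """

    def __init__(self, dim=8, num_points=4, seed=0):
        rng = random.Random(seed)
        self.num_points = num_points
        # Regression branch: 2 outputs (x, y) per polygon point.
        self.reg_w, self.reg_b = init_layer(2 * num_points, dim, rng)
        # Classification branch: a single textness logit.
        self.cls_w, self.cls_b = init_layer(1, dim, rng)

    def forward(self, reg_query, cls_query):
        # Normalised coordinates in (0, 1), paired up as (x, y) points.
        coords = [sigmoid(z) for z in linear(reg_query, self.reg_w, self.reg_b)]
        polygon = [(coords[i], coords[i + 1]) for i in range(0, len(coords), 2)]
        # Probability-like score that the query corresponds to text.
        textness = sigmoid(linear(cls_query, self.cls_w, self.cls_b)[0])
        return polygon, textness


# Usage with made-up query features.
rng = random.Random(1)
heads = SplitHeads()
reg_q = [rng.uniform(-1, 1) for _ in range(8)]
cls_q = [rng.uniform(-1, 1) for _ in range(8)]
polygon, textness = heads.forward(reg_q, cls_q)
```

In this sketch, changing `cls_query` leaves the predicted polygon untouched and vice versa, which is the decoupling the abstract describes at a high level. How the actual model builds and conditions the two decoders is not stated in the abstract.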