GridMask: An Efficient Scheme for Real Time Curved Scene Text Detection

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, Venue: PRCV (7) 2024, License: CC BY-SA 4.0
Abstract: Real-time curved scene text detection remains challenging because of varied backgrounds and diverse text shapes. Existing methods that predict at full scale are time-consuming, whilst low-scale methods cannot handle text in complex scenes. To resolve this problem, we propose a new quarter-scale detection scheme, named GridMask. GridMask models each \(4 \times 4\) pixel block efficiently and avoids post-processing. It formulates text detection as a grid classification and regression task, enabling fast execution. A comprehensive set of experiments on curved and multi-oriented text from four datasets (ICDAR 2015, CTW1500, Total Text and MSRA-TD500) demonstrates that GridMask achieves state-of-the-art execution speed in scene text detection. GridMask also achieves state-of-the-art accuracy on the CTW1500 and Total Text datasets, which implies that GridMask is superior to prior studies in both speed and accuracy. The source code and the trained model are available.
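To make the quarter-scale idea concrete, the following minimal sketch collapses a full-resolution binary text mask into one label per \(4 \times 4\) pixel block, so a grid classifier would need 16 times fewer outputs than a per-pixel one. It is an illustration only and not the GridMask implementation: the function name `grid_labels`, the `threshold` parameter, and the majority-style labelling rule are assumptions introduced here, and the regression part of the method is not shown.

```python
# Illustrative only: not the GridMask implementation.
BLOCK = 4  # block side length in pixels, taken from the abstract's 4 x 4 block


def grid_labels(mask, threshold=0.5):
    """Collapse a full-resolution binary text mask into quarter-scale cells.

    mask: list of rows of 0/1 values; height and width must be multiples of BLOCK.
    threshold: hypothetical cut-off on the fraction of text pixels in a block.
    Returns a (H/4) x (W/4) grid of 0/1 cell labels.
    """
    height, width = len(mask), len(mask[0])
    if height % BLOCK or width % BLOCK:
        raise ValueError("mask dimensions must be multiples of BLOCK")
    grid = []
    for top in range(0, height, BLOCK):
        row = []
        for left in range(0, width, BLOCK):
            # Count the text pixels inside this 4 x 4 block.
            text_pixels = sum(
                mask[y][x]
                for y in range(top, top + BLOCK)
                for x in range(left, left + BLOCK)
            )
            row.append(1 if text_pixels / (BLOCK * BLOCK) >= threshold else 0)
        grid.append(row)
    return grid


# An 8 x 8 mask whose left half is text yields a 2 x 2 grid.
mask = [[1, 1, 1, 1, 0, 0, 0, 0] for _ in range(8)]
print(grid_labels(mask))  # prints [[1, 0], [1, 0]]
```

Here 64 pixel labels reduce to 4 cell labels, which is the source of the speed advantage a quarter-scale scheme can have over full-scale prediction.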