Abstract: This paper presents a novel multi-task cascade framework for scene text detection that jointly performs detection and segmentation. To handle multi-oriented scene text, we propose an instance-level mask approximation method based on an auxiliary regression task on center and corner points. Specifically, text instances in the image are first coarsely detected, and a contextual module then captures more accurate instances. To cope with the scale variation among the detected instances, high-level semantic features are further combined with low-level features, which yields more robust performance. Experiments on several benchmark datasets demonstrate the effectiveness of the proposed method.
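The abstract does not spell out how regressed center and corner points become an instance mask, so the following is only a minimal sketch of one plausible reading. It assumes the auxiliary branch outputs a center point plus four corner offsets relative to that center, and that the mask is approximated by the convex quadrilateral those corners span. All names here (`corners_from_offsets`, `point_in_convex_quad`, `approximate_mask`) are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def corners_from_offsets(center: Point, offsets: List[Point]) -> List[Point]:
    """Recover absolute corner coordinates from a center and per-corner offsets.

    Assumption: corners are regressed as (dx, dy) offsets from the center,
    listed in a consistent clockwise or counter-clockwise order.
    """
    cx, cy = center
    return [(cx + dx, cy + dy) for dx, dy in offsets]


def point_in_convex_quad(p: Point, quad: List[Point]) -> bool:
    """Return True if p lies inside or on the boundary of a convex polygon.

    The point is inside when it is on the same side of every edge, i.e. all
    non-zero edge cross products share one sign.
    """
    px, py = p
    sign = 0
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            s = 1 if cross > 0 else -1
            if sign == 0:
                sign = s
            elif s != sign:
                return False
    return True


def approximate_mask(center: Point, offsets: List[Point],
                     height: int, width: int) -> List[List[int]]:
    """Rasterize the quadrilateral into a binary mask, testing pixel centers."""
    quad = corners_from_offsets(center, offsets)
    return [
        [1 if point_in_convex_quad((x + 0.5, y + 0.5), quad) else 0
         for x in range(width)]
        for y in range(height)
    ]


# Usage: a text instance rotated by 45 degrees, centered at (4, 4).
mask = approximate_mask((4.0, 4.0),
                        [(0.0, -3.0), (3.0, 0.0), (0.0, 3.0), (-3.0, 0.0)],
                        height=8, width=8)
for row in mask:
    print("".join("#" if v else "." for v in row))
```

A quadrilateral built from regressed points follows the orientation of rotated text more closely than an axis-aligned box does, which is presumably why the approximation suits multi-oriented text. A real pipeline would more likely rasterize with a polygon-fill routine than with a per-pixel test.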