Character-Level Street View Text Spotting Based on Deep Multisegmentation Network for Smarter Autonomous Driving

Abstract: Urban scenes are full of street entities with sign boards. In autonomous driving, street view text spotting will therefore play a significant role in precisely understanding the surrounding scene, because text contained in an image usually provides important clues for accurate image understanding, whereas existing computer vision algorithms often cannot interpret scene images unambiguously without it. In this work, we propose a Multi-Segmentation network for character-level scene Text Detection (MSTD). The MSTD introduces a densely connected atrous spatial pyramid pooling module to enlarge the receptive field of the feature extraction layer, so as to localize long as well as large-sized text instances. Moreover, it devises a double segmentation subnetwork that uses two independent but inherently complementary losses to co-optimize the network and increase the reliability of the confidence scores when predicting text/nontext areas. With the character instances detected by the MSTD, one can easily perform scene text spotting with classic object recognition networks such as ResNet and DenseNet. We carried out extensive experiments on nine scene text datasets to demonstrate the performance of the MSTD on character-level and line-level text instance localization and on scene text recognition, where the MSTD significantly outperforms the state-of-the-art scene text detection methods and the sequence-to-sequence-learning-based scene text recognizers.
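To illustrate why densely connecting atrous (dilated) convolutions enlarges the receptive field, the following minimal sketch computes the receptive field of a chain of stride-1 dilated convolutions. The 3x3 kernel size and the dilation rates (3, 6, 12, 18, 24) are assumptions chosen for illustration; the abstract does not state the rates used in the MSTD. The helper names are likewise hypothetical.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(dilations, kernel_size: int = 3) -> int:
    """Receptive field of stride-1 dilated convolutions applied in sequence.

    Each layer widens the receptive field by (effective kernel - 1).
    """
    rf = 1
    for d in dilations:
        rf += effective_kernel(kernel_size, d) - 1
    return rf


if __name__ == "__main__":
    # Assumed dilation rates, for illustration only.
    rates = [3, 6, 12, 18, 24]

    # A parallel pyramid is limited by its largest single branch.
    largest_branch = max(effective_kernel(3, d) for d in rates)

    # A densely connected pyramid also contains the path through every layer.
    longest_path = stacked_receptive_field(rates)

    print("largest single atrous branch:", largest_branch)
    print("longest cascaded path:", longest_path)
```

Under these assumptions, the longest cascaded path covers a much wider area than any single branch, which is the property that helps with long and large-sized text instances.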