Abstract: Text detection in natural scenes has advanced considerably in recent years. Segmentation-based methods are widely used because they can robustly detect text of arbitrary shape. However, most previous work focuses on word-level detection and neglects the regions between adjacent words, which are informative when text instances lie very close together. In this paper, we propose a novel image feature, the affinity area, which exploits the area between two adjacent text instances to improve detection. Since no open dataset provides such annotations, we design an affinity module that generates them from existing word-level annotations. By optimizing this module, our segmentation-based network, TDAE, predicts both text regions and affinity regions, from which we obtain the final detection results. Inspired by evolutionary strategies (ES), our network also employs a novel additional fine-tuning step that updates the parameters by adding adaptive random perturbations, in contrast to the traditional gradient descent approach. Competitive results on the ICDAR (2013, 2015, 2017), CTW-1500, and SynthText benchmarks demonstrate the effectiveness of TDAE.
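To illustrate the general idea behind an ES-style fine-tuning step, the following is a minimal sketch of a gradient-free update that perturbs parameters with random noise and adapts the noise scale. It is not the TDAE procedure: the function `es_fine_tune`, the toy loss, the population size, and the step-size adaptation factors (1.5 and 0.5) are all assumptions chosen for illustration, and the parameters are a plain list of floats rather than network weights.

```python
import random


def es_fine_tune(params, loss_fn, sigma=0.1, population=8, steps=50, seed=0):
    """Hypothetical (1+lambda) evolution-strategy loop (illustrative only)."""
    rng = random.Random(seed)
    best = list(params)
    best_loss = loss_fn(best)
    for _ in range(steps):
        # Sample candidates by adding Gaussian noise scaled by sigma.
        candidates = [
            [p + sigma * rng.gauss(0.0, 1.0) for p in best]
            for _ in range(population)
        ]
        candidate = min(candidates, key=loss_fn)
        candidate_loss = loss_fn(candidate)
        if candidate_loss < best_loss:
            # Success: accept the candidate and widen the search (assumed factor).
            best, best_loss = candidate, candidate_loss
            sigma *= 1.5
        else:
            # Failure: keep the current parameters and narrow the search.
            sigma *= 0.5
    return best, best_loss, sigma


def toy_loss(weights):
    """Stand-in objective: squared distance from the point (3, 3)."""
    return sum((w - 3.0) ** 2 for w in weights)


start = [0.0, 0.0]
tuned, tuned_loss, final_sigma = es_fine_tune(start, toy_loss)
```

Because a candidate is accepted only when it lowers the loss, the returned loss is never worse than the starting loss. No gradient of `loss_fn` is evaluated at any point, which is the contrast with gradient descent drawn in the abstract.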