A Length-Sensitive Language-Bound Recognition Network for Multilingual Text Recognition

Published: 01 Jan 2023, Last Modified: 13 Nov 2024, MMM (2) 2023, CC BY-SA 4.0
Abstract: Because English is so widely used, scene text recognition research has focused largely on English rather than on multilingual text. However, as global integration advances, recognizing multilingual text is increasingly necessary. In this paper, we propose a Length-sensitive Language-bound Recognition Network (LLRN) for multilingual text recognition. LLRN follows the traditional encoder-decoder structure, and we improve both the encoder and the decoder to better suit multilingual text recognition. First, we propose a Length-sensitive Encoder (LE) that encodes features at different scales for long-text and short-text images. Second, we present a Language-bound Decoder (LD), which uses language prior information to constrain the decoder's original output and thereby refine the recognition results. Moreover, to address multilingual data imbalance, we propose a Language-balanced Data Augmentation (LDA) approach. Experiments show that our method outperforms English-oriented mainstream models and achieves state-of-the-art results on the MLT-2019 multilingual recognition benchmark.
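The abstract does not say how LD applies the language prior. One plausible reading is that the decoder's per-step character scores are restricted to the character set of the predicted language. The sketch below illustrates that reading only; it is not the authors' implementation. The vocabulary, the character sets, and the names `LANG_CHARSETS` and `language_bound_decode` are all hypothetical.

```python
import math

# Hypothetical toy vocabulary and per-language character sets (illustrative only).
VOCAB = ["a", "b", "c", "α", "β", "你", "好"]
LANG_CHARSETS = {
    "Latin": {"a", "b", "c"},
    "Greek": {"α", "β"},
    "Chinese": {"你", "好"},
}


def softmax(xs):
    """Numerically stable softmax; entries of -inf receive probability 0."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def language_bound_decode(step_logits, lang_probs):
    """Greedy decoding with characters outside the predicted language masked out.

    step_logits: list of per-step score lists, each aligned with VOCAB.
    lang_probs:  dict mapping language name to predicted probability
                 (assumed here to come from some language classifier).
    """
    lang = max(lang_probs, key=lang_probs.get)
    allowed = LANG_CHARSETS[lang]
    chars = []
    for logits in step_logits:
        # Mask characters that do not belong to the predicted language.
        masked = [
            score if VOCAB[i] in allowed else float("-inf")
            for i, score in enumerate(logits)
        ]
        probs = softmax(masked)
        chars.append(VOCAB[probs.index(max(probs))])
    return lang, "".join(chars)


# Step 1 would pick "α" without the constraint; the Latin prior redirects it to "b".
step_logits = [
    [0.1, 0.2, 0.0, 2.0, 0.3, 0.0, 0.0],
    [0.0, 1.5, 0.2, 0.1, 1.0, 0.0, 0.0],
]
lang_probs = {"Latin": 0.7, "Greek": 0.2, "Chinese": 0.1}
result = language_bound_decode(step_logits, lang_probs)
```

A hard mask like this is the simplest form such a constraint could take. A softer variant would reweight the scores by language probability rather than zeroing them out, which would tolerate mixed-script text at the cost of a weaker constraint.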