Abstract: Automatic recognition and error correction of text in images is critical for many commercial applications, such as receipt recognition, that have very high accuracy requirements. In this paper, we propose an integrated image-based text recognition and correction approach to improve accuracy. The proposed approach integrates recognition and correction at two levels. First, a beam search strategy generates a set of text candidates from the probability distribution of text predictions produced by a deep learning recognition model. A word-level lexicon check then selects a single candidate: the one with the highest prediction probability among those whose words are all present in the lexicon. Together, the beam search and lexicon check can effectively correct some recognition errors. Second, a corrector based on an encoder-decoder language model is developed to correct potential recognition errors in selected output texts that fail the lexicon check. Training samples for the corrector are created from the recognition outcomes and can be expanded by associating multiple text candidates with one image label. We conduct experiments on the ICDAR'13 and CH10K datasets to evaluate the proposed approach and the impact of the two levels of integration on accuracy. Experimental results show that the proposed approach outperforms the existing approach, achieving higher recall and much higher recognition accuracy through effective exploitation of the joint recognition and correction design.
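The first level of integration (beam search followed by a word-level lexicon check) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the recognition model outputs an independent character distribution per step (no CTC blank handling), and the names `beam_search`, `lexicon_check`, `step_probs`, and `beam_width` are hypothetical, chosen only for illustration.

```python
import math


def beam_search(step_probs, beam_width=3):
    """Return up to `beam_width` (text, log_prob) candidates, best first.

    Assumption: `step_probs` is a list of dicts, one per output step,
    mapping a character to its predicted probability.
    """
    beams = [("", 0.0)]  # (partial text, accumulated log-probability)
    for dist in step_probs:
        expanded = [
            (text + ch, score + math.log(p))
            for text, score in beams
            for ch, p in dist.items()
            if p > 0
        ]
        # Keep only the most probable partial texts.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams


def lexicon_check(candidates, lexicon):
    """Pick the most probable candidate whose words are all in `lexicon`.

    Returns (text, passed). If no candidate passes, the top candidate is
    returned with passed=False, marking it for a downstream corrector.
    """
    for text, _ in candidates:  # candidates are sorted best first
        words = text.lower().split()
        if words and all(word in lexicon for word in words):
            return text, True
    return candidates[0][0], False


# Toy example: the model slightly prefers "cot", but only "cat" is a word.
step_probs = [
    {"c": 0.9, "e": 0.1},
    {"o": 0.55, "a": 0.45},
    {"t": 1.0},
]
candidates = beam_search(step_probs, beam_width=3)
selected, passed = lexicon_check(candidates, {"cat", "dog"})
```

In this toy case the greedy output would be "cot", while the lexicon check selects the lower-ranked but valid candidate "cat". When no candidate passes, the `passed` flag is `False`, which corresponds to the case the abstract hands over to the encoder-decoder corrector.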