Abstract: Mathematical expression (ME) recognition is a research field that aims to develop algorithms and systems capable of interpreting mathematical content. Recognizing MEs requires handling two-dimensional symbol relationships such as sub/superscripts, matrices, and nested fractions, among others. The prevalent technology for addressing these challenges is based on encoder-decoder architectures with attention models. In this paper we propose the Py4MER system, based on Convolutional Recurrent Neural Network (CRNN) models, for the recognition and transcription of MEs into LaTeX mark-up sequences. This model is proposed as an alternative to encoder-decoder approaches, as CRNN models trained through Connectionist Temporal Classification (CTC) implicitly model the dependencies between symbols and do not suffer from under/over parsing of the input image, generating more consistent mark-up. The proposed model is evaluated on the Im2Latex-100k data set using both textual and image-level metrics, showing a remarkable improvement over other CTC-based approaches. Recognition results are analyzed for different ME lengths and ME structures. Furthermore, a study based on the edit distance is performed, showing a considerable improvement in precision when up to 5 edit operations are considered. Finally, we show that CTC-based CRNN models can adapt to non-left-to-right ordering of ME elements, warranting more research into this approach.
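The edit-distance study mentioned above can be illustrated with a minimal sketch. It assumes that "precision when up to k edit operations are considered" means the fraction of expressions whose predicted LaTeX token sequence lies within k token-level edits (insertions, deletions, substitutions) of the reference; the abstract does not give the exact definition, so this reading, the whitespace tokenization, and the function names `edit_distance` and `precision_within_k` are all illustrative assumptions rather than the Py4MER implementation.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def precision_within_k(refs: Sequence[str], hyps: Sequence[str], k: int) -> float:
    """Fraction of hypotheses within k token edits of their reference.

    Assumption: tokens are separated by whitespace in the mark-up strings.
    """
    if len(refs) != len(hyps) or not refs:
        raise ValueError("refs and hyps must be non-empty and of equal length")
    hits = sum(edit_distance(r.split(), h.split()) <= k
               for r, h in zip(refs, hyps))
    return hits / len(refs)


if __name__ == "__main__":
    # Toy data, not taken from Im2Latex-100k.
    refs = [r"\frac { a } { b }", r"x ^ { 2 }"]
    hyps = [r"\frac { a } { c }", r"x ^ { 2 }"]
    for k in (0, 1, 5):
        print(k, precision_within_k(refs, hyps, k))
```

With k = 0 this reduces to exact-match accuracy; raising k shows how many outputs are only a few token corrections away from the reference.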