Two-step sequence transformer based method for Cham to Latin script transliteration

Published: 01 Jan 2023, Last Modified: 11 Nov 2023, HIP@ICDAR 2023, Readers: Everyone
Abstract: Fusing visual and textual information is a promising way to obtain better feature representations. In this work, we propose a method for text-line transliteration of Cham manuscripts that combines the visual and textual modalities. Instead of the standard approach of directly recognizing the words in the image, we split the problem into two steps. First, we perform recognition under a scenario in which similar characters are treated as a single character class. Then, a transformer model that considers both visual and contextual information adjusts the predictions for these similar characters so that they can be distinguished. Following this two-step strategy, the proposed method consists of a sequence-to-sequence model and a multi-modal transformer, and thus takes advantage of both. Extensive experiments show that the proposed method outperforms approaches from the literature on our Cham manuscripts dataset.
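
The two-step idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes hypothetical confusable groups over placeholder Latin letters (the actual Cham character groups are not given in the abstract), and it replaces the multi-modal transformer with a caller-supplied `score` function, which is purely a stand-in. Step one collapses each group to a single class; step two picks, among all expansions of the merged sequence, the one the scorer prefers.

```python
from itertools import product

# Hypothetical confusable groups (placeholders, not real Cham character groups).
CONFUSABLE_GROUPS = [("a", "o"), ("n", "u")]

# Step 1 lookup: every member of a group maps to the group's first member.
MERGE = {ch: group[0] for group in CONFUSABLE_GROUPS for ch in group}
# Step 2 lookup: each merged class maps back to all characters it may stand for.
EXPAND = {group[0]: group for group in CONFUSABLE_GROUPS}


def merge_similar(text):
    """Collapse similar characters into one class per group (step 1 labels)."""
    return "".join(MERGE.get(ch, ch) for ch in text)


def candidates(merged):
    """Enumerate every sequence that the merged sequence could stand for."""
    options = [EXPAND.get(ch, (ch,)) for ch in merged]
    return ["".join(chars) for chars in product(*options)]


def disambiguate(merged, score):
    """Pick the highest-scoring expansion.

    `score` is a stand-in for the visual-plus-context model described in the
    abstract; here it is any function from a string to a comparable value.
    """
    return max(candidates(merged), key=score)
```

Exhaustive enumeration grows exponentially with the number of ambiguous positions, so it only serves to make the structure of the problem concrete; the abstract's approach resolves the ambiguity with a learned model instead.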