OCFormer: A Transformer-Based Model For Arabic Handwritten Text Recognition

Aly M. Kassem

28 Jun 2023, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: Optical Character Recognition (OCR) of Arabic historical documents is a challenging task because of their complex layouts and highly variable typography. Nonetheless, in recent years, with the rise of deep learning, significant progress has been made in historical OCR, in layout recognition and segmentation as well as in character recognition. The main shortcoming is the limited progress dedicated to the Arabic language, notably handwritten text. In this paper, we present an OCR approach that utilizes state-of-the-art deep learning techniques for the Arabic language. We built a custom dataset of 30 million obfuscated and noisy images, each paired with its ground truth, to imitate the noise in historical Arabic documents. The model uses both page segmentation and line segmentation techniques to improve the resulting transcription. The model is expressive enough to transcribe handwritten manuscripts. In addition, it can detect and transcribe documents that contain Arabic diacritics. The model attained a character error rate (CER) of 0.0727, a word error rate (WER) of 0.0829, and a SER of 0.10.
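
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how CER, WER, and SER are commonly computed from Levenshtein edit distance. The abstract does not describe the evaluation procedure, so this is an assumption rather than the author's implementation: it treats SER as the fraction of lines that are not transcribed exactly, aggregates errors over the whole corpus, and does no normalization of Unicode or diacritics. The function names `edit_distance` and `error_rates` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Corpus-level CER, WER, and SER for paired reference/hypothesis lines.

    Assumption: SER is the share of lines with at least one error.
    """
    char_errors = char_total = 0
    word_errors = word_total = 0
    seq_errors = 0
    for ref, hyp in zip(references, hypotheses):
        char_errors += edit_distance(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errors += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        seq_errors += ref != hyp
    return {
        "CER": char_errors / char_total,
        "WER": word_errors / word_total,
        "SER": seq_errors / len(references),
    }
```

Calling `error_rates(ground_truth_lines, predicted_lines)` on two equal-length lists of strings returns a dictionary with the three rates. Because Python strings are sequences of Unicode code points, an Arabic diacritic counts as its own character here, so a missing diacritic contributes one character error.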
