Generative Data Augmentation for Arabic Handwritten Digit Recognition Boosting Real-time OCR Capabilities

Published: 01 Jan 2023, Last Modified: 04 Mar 2025, AIPR 2023, CC BY-SA 4.0
Abstract: This study assesses the effectiveness of Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) in improving Optical Character Recognition (OCR) accuracy for Arabic handwritten digits, an area with limited prior research and significant challenges arising from similarities between characters. We use these models for data augmentation, generating varied images to address the difficulties OCR systems face with low-quality or distorted inputs. We introduce a novel procedure for calculating the Fréchet Inception Distance (FID) that measures the quality of synthetic images during training and enables early stopping to optimize model performance. In addition, saliency maps allow a detailed analysis of the OCR improvements. Our findings highlight the potential of synthetic data for advancing real-time OCR systems, and our evaluation procedure offers a faster, more accurate alternative to traditional quality metrics.
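The abstract does not spell out how FID drives early stopping. As a minimal sketch only, assuming a standard patience-based rule applied to one FID value per epoch (lower FID meaning synthetic images are closer to real ones), the idea could look like the following. The function names, the `patience` and `min_delta` parameters, and the sample scores are hypothetical illustrations, not the paper's procedure or results.

```python
from typing import Optional, Sequence


def early_stop_epoch(fid_per_epoch: Sequence[float],
                     patience: int = 3,
                     min_delta: float = 0.0) -> Optional[int]:
    """Return the epoch index at which training would stop, or None.

    Assumed rule (not taken from the paper): stop once FID has failed to
    improve by more than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, fid in enumerate(fid_per_epoch):
        if fid < best - min_delta:
            # New best (lowest) FID: reset the patience counter.
            best = fid
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


def best_epoch(fid_per_epoch: Sequence[float]) -> int:
    """Return the index of the epoch with the lowest FID (the checkpoint to keep)."""
    return min(range(len(fid_per_epoch)), key=lambda i: fid_per_epoch[i])


# Hypothetical FID scores, one per training epoch (illustrative values only).
scores = [120.0, 80.0, 60.0, 61.0, 62.5, 63.0]
stop = early_stop_epoch(scores, patience=3)
keep = best_epoch(scores)
```

With these illustrative scores, the rule halts at the third consecutive non-improving epoch and keeps the generator checkpoint from the lowest-FID epoch; how the FID values themselves are computed during training is the part the paper's procedure addresses.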