Handwritten Text Generation via Disentangled Representations

Published: 01 Jan 2021, Last Modified: 03 Feb 2025 · IEEE Signal Process. Lett. 2021 · CC BY-SA 4.0
Abstract: Automatically generating handwritten text images is a challenging task due to the diversity of handwriting styles and the irregularity of writing in natural scenes. In this paper, we propose an effective generative model, HTG-GAN, that synthesizes handwritten text images from a latent prior. Unlike single-character synthesis, our method can generate images of character sequences of arbitrary length, which requires attending to the structural relationships between characters. We model these structural relationships as part of the style representation, avoiding explicit modeling of stroke layout. Specifically, each text image is disentangled into a style representation and a content representation: the style representation is mapped to a Gaussian distribution, while the content representation is embedded via character indices. In this way, our model can generate new handwritten text images with specified content and varied styles for data augmentation, thereby boosting handwritten text recognition (HTR). Experimental results show that our method achieves state-of-the-art performance in handwritten text generation.
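The abstract describes a generator conditioned on two disentangled factors: a style vector sampled from a Gaussian prior (shared across the sequence, capturing inter-character layout) and a content sequence embedded from character indices. The following is a minimal PyTorch-style sketch of that interface only; all module names, dimensions, and the toy decoder are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class HTGGeneratorSketch(nn.Module):
    """Hypothetical generator: content from character indices, style from N(0, I)."""

    def __init__(self, num_chars=80, embed_dim=128, style_dim=128, img_h=64, patch_w=32):
        super().__init__()
        # Content: each character index is embedded independently, so the
        # generator can compose sequences of arbitrary length.
        self.content_embed = nn.Embedding(num_chars, embed_dim)
        # Style: a single latent vector drawn from a Gaussian prior, shared
        # across the whole sequence to encode style and character layout.
        self.style_proj = nn.Linear(style_dim, embed_dim)
        # Toy decoder: maps fused (style + content) features to one image
        # patch per character; a real model would use learned upsampling.
        self.decode = nn.Sequential(
            nn.Linear(embed_dim, 512), nn.ReLU(),
            nn.Linear(512, img_h * patch_w),
        )
        self.img_h, self.patch_w = img_h, patch_w

    def forward(self, char_indices, style):
        # char_indices: (B, L) integer tensor; style: (B, style_dim)
        content = self.content_embed(char_indices)              # (B, L, E)
        fused = content + self.style_proj(style).unsqueeze(1)   # broadcast style
        patches = self.decode(fused)                            # (B, L, H*W)
        B, L, _ = patches.shape
        # Concatenate per-character patches along the width axis.
        patches = patches.view(B, L, self.img_h, self.patch_w)
        return patches.permute(0, 2, 1, 3).reshape(B, 1, self.img_h, L * self.patch_w)

# Usage: specify content by index, sample a new style from the Gaussian prior.
gen = HTGGeneratorSketch()
text = torch.randint(0, 80, (2, 5))   # two 5-character strings
z = torch.randn(2, 128)               # style drawn from N(0, I)
images = gen(text, z)                 # (2, 1, 64, 160)
```

Because content and style enter through separate pathways, the same character sequence can be rendered in many styles simply by resampling z, which is how such a model can serve as a data augmenter for HTR.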