Smile: Sequence-to-Sequence Domain Adaptation with Minimizing Latent Entropy for Text Image Recognition
Abstract: Excellent text recognition results have been obtained by training recognition models on synthetic images. However, recognizing text in real-world images remains challenging due to the domain shift between synthetic and real-world text images. One strategy for eliminating this domain gap without manual annotation is unsupervised domain adaptation (UDA). Because text recognition is a sequential labeling task, most popular UDA methods cannot be applied to it directly. To tackle this problem, we propose a UDA method that minimizes latent entropy in sequence-to-sequence attention-based models with class-balanced self-paced learning. Experimental results show that our proposed framework achieves better recognition results than existing methods on most UDA text recognition benchmarks. All code is publicly available.
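To illustrate the core idea of minimizing latent entropy, the following is a minimal NumPy sketch, not the paper's implementation: it computes the mean Shannon entropy of a decoder's per-timestep class distributions, which could serve as an unsupervised loss on unlabeled target-domain images. The function name and the toy logits are illustrative assumptions.

```python
import numpy as np

def latent_entropy_loss(logits):
    """Mean Shannon entropy of the per-timestep output distributions.

    logits: array of shape (T, C), decoder scores for T decoding steps
    over C character classes. Minimizing this quantity on unlabeled
    target-domain images pushes the model toward confident predictions.
    """
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Shannon entropy per timestep, averaged over the sequence.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return entropy.mean()

# A peaked (confident) distribution has lower entropy than a uniform one.
peaked = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
uniform = np.zeros((2, 3))
assert latent_entropy_loss(peaked) < latent_entropy_loss(uniform)
```

In a training loop this loss would be added, weighted, to the supervised loss on labeled source-domain data; the class-balanced self-paced learning described in the abstract would additionally select which target samples contribute to it.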