Enhancing scene text detectors with realistic text image synthesis using diffusion models

Ling Fu, Zijie Wu, Yingying Zhu, Yuliang Liu, Xiang Bai

Published: 2025, Last Modified: 05 Mar 2025, Comput. Vis. Image Underst. 2025, CC BY-SA 4.0

Abstract: Highlights
• We present DiffText, a pipeline that can produce realistic text images.
• We introduce two strategies to enhance the credibility of the generated text.
• We produce 10,000 realistic scene text images to help scene text detectors.