DSText V2: A comprehensive video text spotting dataset for dense and small text

Weijia Wu, Yiming Zhang, Yefei He, Luoming Zhang, Zhenyu Lou, Hong Zhou, Xiang Bai

Published: 2024, Last Modified: 14 May 2025, Pattern Recognit. 2024, CC BY-SA 4.0

Abstract: Highlights
• Additional video data: We collect and annotate 40 additional high-quality, high-resolution videos and add them to the training set of DSText V1, expanding the size and diversity of the dataset. We also provide a well-maintained benchmark page with links for dataset download: DSText.
• Data comparison and analysis: We provide comprehensive data comparisons and analyses, including but not limited to the distribution of average text area across 7 open scenarios (Figure 2 (d)), the distribution of average text number per frame (density) across 7 open scenarios (Figure 2 (e)), and the distribution of frames by text count (Figure 5).
• Comprehensive experimental analysis: We provide a new, comprehensive experimental analysis and additional insights into the unique challenges of our dataset (small and dense text), enabling future researchers to better understand and leverage its potential.
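The three statistics named in the highlights (average text area per scenario, average text number per frame, and the number of frames for each text count) can be illustrated with a minimal sketch. The annotation layout, the scenario names `scene_a` and `scene_b`, and the function `dataset_statistics` are all hypothetical and chosen for illustration; they are not the DSText file format or the authors' analysis code.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical annotation layout (NOT the DSText format): each frame is
# (scenario, [(width, height), ...]) with one box per text instance.
frames = [
    ("scene_a", [(12, 6), (20, 8)]),
    ("scene_a", [(10, 5)]),
    ("scene_b", [(8, 4), (8, 4), (16, 4), (4, 4)]),
]

def dataset_statistics(frames):
    """Return (average text area, average texts per frame, frames per text count)."""
    areas = defaultdict(list)     # scenario -> area of every text box
    counts = defaultdict(list)    # scenario -> number of texts in each frame
    frames_per_count = Counter()  # text count -> number of frames with that count
    for scenario, boxes in frames:
        areas[scenario].extend(w * h for w, h in boxes)
        counts[scenario].append(len(boxes))
        frames_per_count[len(boxes)] += 1
    # Skip scenarios with no boxes so mean() never sees an empty list.
    avg_area = {s: mean(a) for s, a in areas.items() if a}
    avg_density = {s: mean(c) for s, c in counts.items()}
    return avg_area, avg_density, dict(frames_per_count)

avg_area, avg_density, frames_per_count = dataset_statistics(frames)
print(avg_area, avg_density, frames_per_count)
```

A small average area together with a high average count per frame is what the highlights refer to as the small and dense text challenge.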