Fast Data Augmentation for Scene Text Recognition Using CUDA

David Angelo Piscasio, Rowel Atienza

Published: 2023, Last Modified: 28 Feb 2024, TENCON 2023, Readers: Everyone

Abstract: Scene Text Recognition (STR) is a computer vision task that reads text in natural scene images. STR currently suffers from data distribution shift due to the lack of large real datasets for training. Data augmentation has been used in multiple studies to address this issue. However, performing augmentation also introduces computational overhead during training. In this paper, we propose FastSTRAug, a CUDA-based library of 36 augmentation functions specifically designed for STR. When run on images of varying sizes, FastSTRAug is observed to be significantly faster than its serial counterpart for most functions, reaching up to 380x speedup on larger images.
