Abstract: This paper proposes an algorithm for fast text line extraction in document images. Instead of binarizing the image or applying multi-oriented Gaussian blurring, as conventional methods do, we compute an integral image and design filters suited to detecting text regions on it. After filtering, the center points of the text regions are found by cascade text region verification followed by non-maximum suppression. Finally, text lines are extracted by grouping the points that lie on the same line. The proposed method is tested on document images taken in various environments and is shown to be faster than conventional methods while achieving comparable performance.
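The speed advantage of an integral image comes from the fact that the sum over any axis-aligned rectangle can be read off with four lookups, regardless of the rectangle's size. The following is a minimal sketch of that idea only; it does not reproduce the filters designed in the paper, and the names `integral_image` and `box_sum` are hypothetical, chosen here for illustration.

```python
def integral_image(img):
    """Build a summed-area table from a 2D list of pixel values.

    The table has one extra zero row and column, so that
    ii[y][x] holds the sum of img[0:y][0:x].
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right).

    Uses four table lookups, independent of the box size.
    """
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    ii = integral_image(img)
    print(box_sum(ii, 1, 1, 3, 3))  # sum of 5, 6, 8, 9
```

A box filter response at every pixel can then be computed in time proportional to the number of pixels, which is presumably what makes filtering on the integral image cheaper than repeated Gaussian blurring at multiple orientations.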