Markov Random Field Based Text Identification from Annotated Machine Printed Documents

Published: 2009 (ICDAR 2009), Last Modified: 10 Nov 2023
Abstract: In this paper, we describe an approach to segmenting handwritten text, machine-printed text, and noise in annotated machine-printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure that uses a Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
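
The relabeling step can be illustrated with a small sketch. The abstract does not specify the features, the K-Means modification, or the MRF potentials, so everything below is an assumption made for illustration: a Potts smoothness penalty between neighboring patches, min-sum belief propagation, and hypothetical names (`relabel`, `unary`, `edges`, `smooth`). The unary costs stand in for something like a patch's distance to each cluster centroid after the clustering stage.

```python
# Illustrative sketch only, not the paper's method: min-sum belief
# propagation on a Potts-model MRF over neighboring patches.
LABELS = ("handwritten", "printed", "noise")  # label order is an assumption


def relabel(unary, edges, smooth=1.0, iters=10):
    """Return one label index per patch.

    unary[i][a]: cost of giving patch i label a (hypothetical, e.g. the
                 distance from patch i to the centroid of cluster a).
    edges:       list of (i, j) pairs of neighboring patches.
    smooth:      Potts penalty paid when two neighbors disagree.
    """
    n, k = len(unary), len(unary[0])
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    # msg[(i, j)][b]: cost that patch i reports to j for j taking label b.
    msg = {(i, j): [0.0] * k for i in nbrs for j in nbrs[i]}

    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            # Cost of each label at i, ignoring what j previously told i.
            h = [unary[i][a] + sum(msg[(m, i)][a] for m in nbrs[i] if m != j)
                 for a in range(k)]
            out = [min(h[a] + (0.0 if a == b else smooth) for a in range(k))
                   for b in range(k)]
            lo = min(out)
            new[(i, j)] = [v - lo for v in out]  # normalize for stability
        msg = new

    labels = []
    for i in range(n):
        belief = [unary[i][a] + sum(msg[(m, i)][a] for m in nbrs[i])
                  for a in range(k)]
        labels.append(belief.index(min(belief)))
    return labels


# Three patches in a row; the middle one weakly prefers "handwritten"
# while both of its neighbors strongly prefer "printed".
costs = [[5.0, 0.0, 5.0], [0.9, 1.0, 5.0], [5.0, 0.0, 5.0]]
result = relabel(costs, [(0, 1), (1, 2)], smooth=1.0)
print([LABELS[a] for a in result])
```

In this toy case the middle patch's initial preference is overridden by its neighbors, which is the kind of context-driven correction a relabeling pass is meant to provide. Setting `smooth=0.0` disables the neighbor influence and leaves each patch with its lowest-cost label.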