Abstract: Libraries contain enormous amounts of handwritten historical documents which cannot be made available on-line because they do not have a searchable index. The wordspotting idea has previously been proposed as a solution to creating indexes for such documents and collections by matching word images. In this paper we present an algorithm which compares whole word-images based on their appearance. This algorithm recovers correspondences of points of interest in two images, and then uses these correspondences to construct a similarity measure. This similarity measure can then be used to rank word-images in order of their closeness to a querying image. We achieved an average precision of 62.57% on a set of 2372 images of reasonable quality and an average precision of 15.49% on a set of 3262 images from documents of poor quality that are even hard to read for humans.
0 Replies
Loading