Automatic Generation of Structured Hyperdocuments from Multi-Column Document Images

ICPR 2000 (modified: 05 Nov 2022)
Abstract: We propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents (ToC) page based on logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that the proposed methods can generate HTML documents that preserve the visual layout of the corresponding paper documents, and can also produce a structured ToC page in which the hierarchically ordered section titles are hyperlinked to the contents.