Search in Archival Facsimile Documents for Digital History

Published: 01 Jan 2023, Last Modified: 29 Feb 2024, e-Science 2023, Readers: Everyone
Abstract: Recent advances in text digitization and processing have opened up many possibilities for historical archives to be processed and digitized in an efficient and automated manner. Processing steps such as language detection, optical character recognition (OCR), named entity recognition (NER), recognition error detection, and automated or manual correction can yield digitized archives that provide both high-quality facsimile representations of the original document scans and extracted text metadata that stays close to the original text in a machine-friendly format. Exploration of digitally enhanced archives is an important step forward in the future workflow of archivists and historians alike. After analysing the requirements of these users, we propose a concept for dynamically generating retrieval-relevant facsimile image snippets. This work demonstrates a Human-in-the-Loop retrieval and research workflow based on these methods by providing a search user interface prototype geared towards intuitively exploring topics across a multilingual historical facsimile archive corpus.