Transfer Learning Methods for Extracting, Classifying and Searching Large Collections of Historical Images and Their Captions

Published: 01 Jan 2020, Last Modified: 07 Nov 2025, Venue: ICPR Workshops (7) 2020, License: CC BY-SA 4.0
Abstract: This paper describes the creation of an interactive software tool and dataset for exploring the unindexed 11-volume set Pompei: Pitture e Mosaici (PPM), a valuable resource containing over 20,000 annotated historical images of the archaeological site of Pompeii, Italy. The tool supports word search as well as similarity search over both images and captions. Similarity searches use transfer learning on data retrieved from the scanned version of PPM. Image processing, convolutional neural networks, and natural language processing were also needed to extract, classify, and archive the text and image data from the digitized version of the books.
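
To make the similarity search concrete, the following is a minimal sketch and not the authors' implementation. It assumes each image or caption has already been mapped to a fixed-length feature vector by some pretrained model, and it assumes cosine similarity as the ranking metric; the abstract specifies neither. The names `cosine_similarity`, `most_similar`, and the toy `index` entries are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=3):
    """Return up to k (item_id, score) pairs from index, highest score first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy feature vectors standing in for embeddings of images or captions.
index = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}

# Rank the indexed items against a query vector and keep the two best matches.
results = most_similar([1.0, 0.0, 0.0], index, k=2)
```

The same ranking function would apply to images and captions alike, since only the source of the vectors differs. A collection of roughly 20,000 items is small enough that an exhaustive scan like this one is plausible, though a vectorized or indexed search would be the usual choice in practice.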