Drawing the Line: A Dual Evaluation Approach for Shaping Ground Truth in Image Retrieval Using Rich Visual Embeddings of Historical Images

Published: 01 Jan 2023, Last Modified: 06 Nov 2023, Venue: HIP@ICDAR 2023, Readers: Everyone
Abstract: Images contain rich visual information that can be interpreted in multiple ways, each of which may be correct. However, current retrieval systems in computer vision predominantly focus on content-based and instance-based image retrieval, while other facets relevant to the querying person, such as temporal aspects or image syntax, are often neglected. This study addresses this issue by examining a retrieval system in a domain-specific document processing pipeline. A retrieval evaluation dataset, which focuses on the aforementioned tasks, is utilized to compare different promising approaches. Subsequently, a qualitative study is conducted to compare the usability of the retrieval results with their corresponding metrics. This comparison reveals a discrepancy between the best-performing model by performance metrics and the model that provides better results for answering potential research questions. While current models such as DINO and CLIP demonstrate their ability to retrieve images based on their semantics and contents with high reliability, they exhibit limited capabilities in retrieving other facets.