Exquisitor at the Video Browser Showdown 2026: Temporal Queries Revisited

Omar Shahbaz Khan, Ujjwal Sharma, Gonçalo Marcelino, Stevan Rudinac, Björn Þór Jónsson

Published: 2026, Last Modified: 07 Apr 2026. Venue: MMM (4) 2026. License: CC BY-SA 4.0
Abstract: In today’s data-rich world, multimedia content is produced at unprecedented rates, creating challenges for building systems that support evolving and often unknown information needs. Competitions such as the Video Browser Showdown (VBS) and the Lifelog Search Challenge (LSC) push researchers to develop systems that assist users in complex retrieval and analytical tasks. Exquisitor is an experimental, scalable multimedia retrieval system that integrates conversational search, relevance feedback, and metadata-based filtering to support exploratory and analytical search across large collections. Its development has been guided by insights gained from participating in previous editions of VBS and LSC, evolving from a pure relevance feedback system to one that incorporates conversational interaction powered by large language models (LLMs). In this paper, we address two drawbacks observed in VBS 2025. First, tasks with temporal components highlighted the need for improved temporal querying. To address this, we propose a novel sequence-chain method combined with reciprocal rank fusion (RRF). Second, to enhance performance in question-answering tasks, we introduce in-video search to facilitate rapid content understanding.
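The abstract names reciprocal rank fusion (RRF) but does not describe how it is combined with the sequence-chain method. As general background only, the following is a minimal sketch of standard RRF, in which each item receives a score of 1 / (k + rank) from every ranked list it appears in. The function name, the segment identifiers, and the constant k = 60 (a commonly used default) are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of item ids into one list.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant (60 is a common default; an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) to the item's fused score.
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical ranked lists, e.g. one per sub-query of a temporal query.
first_event = ["seg_a", "seg_b", "seg_c"]
second_event = ["seg_b", "seg_d", "seg_a"]
fused = reciprocal_rank_fusion([first_event, second_event])
```

Items that rank well in several lists rise to the top of the fused list, which is why RRF is often used to merge results from multiple sub-queries without needing comparable raw scores.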