Image retrieval with mixed initiative and multimodal feedback

Nils Murrugarra-Llerena, Adriana Kovashka

2021 (modified: 14 Dec 2021). Comput. Vis. Image Underst. 2021. Readers: Everyone

Highlights:
• We propose a mixed-initiative framework for image retrieval.
• Users can initiate or the system can request feedback, depending on information gain.
• Reinforcement learning enables an agent to choose useful feedback interactions.
• Intelligently choosing the type of feedback is superior to interleaving interactions.
• The mixed-initiative framework is superior to the four individual interactions.

Abstract: How would you search for a unique, flamboyant shoe that a friend wore and you want to buy? What if you did not take a picture? Existing approaches propose interactive image search, but they either entrust the user with taking the initiative to provide informative feedback, or give all control to the system which determines informative questions to ask. Instead, we propose a mixed-initiative framework where both the user and system can be active participants, depending on whose input will be more beneficial for obtaining high-quality search results. We develop a reinforcement learning approach which dynamically decides which of four interaction opportunities to give to the user: drawing a sketch, marking images as relevant or not, providing free-form attribute feedback, or answering attribute-based questions. By allowing these four options, our system optimizes both the informativeness of feedback, and the ability of the user to explore the data, allowing faster image retrieval. We outperform five baselines on three datasets under extensive settings.
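To make the idea of "dynamically deciding which of four interaction opportunities to give to the user" concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify the agent, so an epsilon-greedy value estimator is swapped in as a simple stand-in for the reinforcement learning approach. The class name `InteractionSelector`, the action labels, and the notion of a scalar reward (for example, how much the target image's rank improved after an interaction) are all assumptions made for illustration.

```python
import random

# The four interaction types named in the abstract (labels are hypothetical).
ACTIONS = [
    "sketch",
    "relevance_feedback",
    "attribute_feedback",
    "attribute_question",
]


class InteractionSelector:
    """Epsilon-greedy chooser over interaction types.

    Illustrative stand-in only; the paper's actual agent is not
    described in the abstract.
    """

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward and visit count per interaction type.
        self.value = {a: 0.0 for a in ACTIONS}
        self.count = {a: 0 for a in ACTIONS}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the
        # interaction with the highest estimated reward so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.value[a])

    def update(self, action, reward):
        # Incremental update of the mean reward for this action.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]
```

In use, a search session would call `choose()` at each round, present that interaction to the user, measure some reward supplied by the caller, and pass it to `update()`. A full agent would also condition its choice on the current search state, which this sketch omits.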
