CIRQRS: Evaluating Query Relevance Score in Composed Image Retrieval

27 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0
Keywords: composed image retrieval, evaluation metric, self-paced learning
Abstract: Composed Image Retrieval (CIR) retrieves relevant images using a reference image and accompanying text that describes how the desired images differ from the reference. However, the commonly used evaluation metric Recall@k only checks whether the target image is retrieved and ignores the relevance of the other retrieved images to the query, which can lead to user dissatisfaction. We introduce the Composed Image Retrieval Query Relevance Score (CIRQRS), an evaluation metric that scores each retrieved image by its relevance to the query, offering a more comprehensive evaluation. CIRQRS is trained with a reward model objective to prefer highly relevant, positive images over less relevant, negative ones. We propose a strategy, motivated by self-paced learning, that dynamically adjusts the negative set according to the relevance of each image as estimated from CIRQRS's current training status. To validate CIRQRS's ability to measure relevance, we created the human-scored FashionIQ (HS-FashionIQ) dataset and compared CIRQRS against its scores from human evaluators. CIRQRS correlates with human scores 2.625 times better than Recall@k, highlighting its superior ability to capture relevance. Additionally, we rank images by their CIRQRS and check whether the target image appears in the top k. The results show that CIRQRS achieves state-of-the-art performance on two representative CIR datasets, CIRR and FashionIQ.
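To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. `recall_at_k` follows the standard definition of Recall@k for a single query with one target image. `pairwise_preference_loss` is an assumed Bradley-Terry-style form of a reward model objective; the abstract does not give the exact loss, and all function and variable names here are hypothetical.

```python
import math


def recall_at_k(ranked_ids, target_id, k):
    """Return 1.0 if the target image is among the top-k results, else 0.0.

    Only the target's presence matters; the relevance of the other
    retrieved images is ignored.
    """
    return 1.0 if target_id in ranked_ids[:k] else 0.0


def pairwise_preference_loss(pos_score, neg_scores):
    """Mean of -log(sigmoid(pos - neg)) over the negative set.

    Assumption: a Bradley-Terry-style pairwise loss, which is a common
    choice for reward models. The loss shrinks as the positive image's
    score rises above each negative image's score.
    """
    # -log(sigmoid(x)) == log(1 + exp(-x))
    losses = [math.log1p(math.exp(-(pos_score - s))) for s in neg_scores]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    ranking = ["img_b", "img_c", "img_a"]  # hypothetical retrieval result
    print(recall_at_k(ranking, "img_a", k=2))
    print(recall_at_k(ranking, "img_a", k=3))
    print(pairwise_preference_loss(2.0, [0.0, 0.5]))
```

In this sketch, two rankings that both place the target in the top k receive the same Recall@k even if the remaining images differ widely in relevance, which is the gap a per-image relevance score is meant to close.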
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9797