Dual-Path Semantic Construction Network for Composed Query-Based Image Retrieval

Published: 01 Jan 2023, Last Modified: 01 Nov 2023, ICMR 2023, Readers: Everyone
Abstract: Composed Query-Based Image Retrieval (CQBIR) aims to retrieve the most relevant image from a set of candidates according to a composed query. However, the multi-modal query makes it harder to learn the proper semantics, which include both the traits mentioned in the text and the resemblance to the reference image. Improperly learned semantics reduce the performance of existing CQBIR methods. To this end, we propose a novel framework termed Dual-Path Semantic Construction Network for Composed Query-Based Image Retrieval (DSCN). It consists of three components: (1) a Multi-level Feature Extraction module, which obtains textual and visual features at various hierarchies for learning multi-level semantics; (2) a Visual-to-Textual Semantic Construction module, which refines the learned semantics at the textual level; and (3) a Textual-to-Visual Semantic Construction module, which performs semantic guidance in the visual semantic space. Extensive experiments on three benchmarks, i.e., FashionIQ, Shoes, and Fashion200k, show that our DSCN method outperforms recent state-of-the-art methods.
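To make the CQBIR task setup concrete, the following is a minimal sketch of the retrieval step only. It is not the DSCN architecture: the additive composition in `compose_query` is a placeholder assumption standing in for a learned composition module, and the function names, feature vectors, and candidate identifiers are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def compose_query(ref_image_feat, text_feat):
    """Placeholder composition (assumption): sum the normalized reference-image
    and text features, then renormalize. A real CQBIR model learns this step."""
    img = l2_normalize(ref_image_feat)
    txt = l2_normalize(text_feat)
    return l2_normalize([a + b for a, b in zip(img, txt)])


def rank_candidates(query, candidates):
    """Rank candidate image ids by cosine similarity to the composed query."""
    scores = {
        name: sum(q * c for q, c in zip(query, l2_normalize(feat)))
        for name, feat in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy 3-dimensional features, purely illustrative.
    query = compose_query([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
    gallery = {
        "img_a": [1.0, 1.0, 0.0],
        "img_b": [1.0, 0.0, 0.0],
        "img_c": [0.0, 0.0, 1.0],
    }
    print(rank_candidates(query, gallery))
```

In this toy setting the candidate that resembles the reference image and also matches the text ranks first, which is the behavior a CQBIR method is trained to produce with learned features.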