Semantic-Adversarial Graph Convolutional Network for Zero-Shot Cross-Modal Retrieval

Published: 2022 | Last Modified: 17 May 2023 | PRICAI (2) 2022 | Readers: Everyone
Abstract: Traditional cross-modal retrieval (CMR) methods assume that the training data covers all categories that appear at the retrieval stage. When multimodal data from new categories arrives, however, the learned model may perform poorly. Zero-shot cross-modal retrieval (ZS-CMR), which builds on the theory of zero-shot learning, has emerged as a new research topic to address this problem. Existing ZS-CMR methods have two limitations. (1) The semantic association between seen and unseen categories is important but ignored, so semantic knowledge cannot be fully transferred from seen classes to unseen classes. (2) The cross-modal representations are not semantically aligned, so samples of new categories cannot obtain semantic representations, which leads to unsatisfactory retrieval results. To tackle these problems, this paper proposes the semantic-adversarial graph convolutional network (SAGCN) for ZS-CMR. Specifically, a graph convolutional network is introduced to mine the potential relationships between categories. In addition, adversarial learning and semantic similarity reconstruction are used to learn a common space in which multimodal embeddings and class embeddings are semantically fused. Finally, a shared classifier is adopted to enhance the discriminative ability of the common space. Experiments on three datasets demonstrate the effectiveness of SAGCN on both traditional CMR and ZS-CMR tasks.
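
The abstract does not give the network's details, so the following is only a minimal sketch of how a graph convolution could propagate information between category nodes. It assumes a standard symmetric-normalized graph convolution layer, a class graph built by thresholding cosine similarity between class vectors, and random placeholder data; the function names, the threshold, and the dimensions are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_adjacency(class_vecs, threshold=0.5):
    """Assumed graph construction: link two classes when the cosine
    similarity of their class vectors reaches the threshold."""
    unit = class_vecs / np.linalg.norm(class_vecs, axis=1, keepdims=True)
    sim = unit @ unit.T
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added during normalization
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """One graph convolution step followed by ReLU."""
    return np.maximum(a_norm @ features @ weight, 0.0)

# Placeholder data: 5 classes (seen and unseen together), 8-dim class vectors.
rng = np.random.default_rng(0)
class_vecs = rng.normal(size=(5, 8))
adj = cosine_adjacency(class_vecs, threshold=0.0)
a_norm = normalize_adjacency(adj)
weight = rng.normal(size=(8, 4))  # would be learned in practice
class_emb = gcn_layer(a_norm, class_vecs, weight)
```

In a sketch like this, each class embedding becomes a weighted mix of its neighbors' vectors, which is one way semantic knowledge could flow from seen to unseen categories that share edges in the graph.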