Abstract: The main challenge of cross-modal retrieval is to efficiently achieve semantic alignment and reduce the heterogeneity gap. However, existing approaches ignore the learning of multi-grained semantic knowledge from different modalities. To this end, this paper proposes a novel end-to-end cross-modal representation method, termed Multi-Graph based Hierarchical Semantic Fusion (MG-HSF). The method integrates multi-graph hierarchical semantic fusion with cross-modal adversarial learning, capturing both fine-grained and coarse-grained semantic knowledge from cross-modal samples and generating modality-invariant representations in a common subspace. To evaluate the performance, extensive experiments are conducted on three benchmarks. The experimental results show that our method outperforms state-of-the-art approaches.
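The abstract does not include implementation details, so the following is only a minimal illustrative sketch of the general idea it names: graph-based encoders projecting each modality into a common subspace, trained adversarially against a modality discriminator so that the resulting representations become modality-invariant. All class names, feature dimensions, and the use of PyTorch here are assumptions for illustration, not the MG-HSF implementation.

```python
import torch
import torch.nn as nn

class GraphFusionLayer(nn.Module):
    """A simple graph-convolution-style layer: node features are aggregated
    over an adjacency matrix and passed through a linear transform."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # x: (num_nodes, dim), adj: (num_nodes, num_nodes), row-normalized
        return torch.relu(self.linear(adj @ x))

class ModalityEncoder(nn.Module):
    """Projects modality-specific features into the shared subspace and
    refines them with a graph layer over intra-modal neighbors."""
    def __init__(self, in_dim, common_dim):
        super().__init__()
        self.project = nn.Linear(in_dim, common_dim)
        self.graph = GraphFusionLayer(common_dim)

    def forward(self, feats, adj):
        h = torch.relu(self.project(feats))
        return self.graph(h, adj)

class ModalityDiscriminator(nn.Module):
    """Predicts which modality an embedding came from; the encoders are
    trained to fool it, which encourages modality-invariant representations."""
    def __init__(self, common_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(common_dim, common_dim // 2),
            nn.ReLU(),
            nn.Linear(common_dim // 2, 1),
        )

    def forward(self, z):
        return self.net(z)  # logits: >0 ~ "image", <0 ~ "text"

if __name__ == "__main__":
    # Toy batch: 4 image features (2048-d) and 4 text features (300-d),
    # with uniform (fully connected) intra-modal graphs.
    img_feats, txt_feats = torch.randn(4, 2048), torch.randn(4, 300)
    adj = torch.full((4, 4), 0.25)

    img_enc = ModalityEncoder(2048, 256)
    txt_enc = ModalityEncoder(300, 256)
    disc = ModalityDiscriminator(256)

    z_img, z_txt = img_enc(img_feats, adj), txt_enc(txt_feats, adj)

    bce = nn.BCEWithLogitsLoss()
    # The discriminator tries to tell the two modalities apart ...
    d_loss = bce(disc(z_img), torch.ones(4, 1)) + bce(disc(z_txt), torch.zeros(4, 1))
    # ... while the encoders are updated with flipped labels so the
    # embeddings of the two modalities become indistinguishable.
    g_loss = bce(disc(z_img), torch.zeros(4, 1)) + bce(disc(z_txt), torch.ones(4, 1))
    print(d_loss.item(), g_loss.item())
```

In practice the two loss terms would be optimized alternately (or via a gradient-reversal layer), alongside retrieval losses that enforce cross-modal semantic alignment; those components are omitted here for brevity.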