Abstract: Cross-modal hashing retrieval has emerged as a promising approach for handling diverse multimodal data, owing to its advantages in storage efficiency and query speed. However, existing cross-modal hashing retrieval methods often oversimplify similarity by considering only identical labels across modalities, and they are sensitive to noise in the original multimodal data. To address these challenges, we propose a cross-modal hashing retrieval approach with compatible triplet representation. The proposed approach integrates the essential feature representations and semantic information of text and images into their corresponding multi-label feature representations, and introduces a fusion attention module that extracts channel and spatial attention features from the text and image modalities, respectively, thereby enriching the compatible-triplet-based semantic information used in cross-modal hashing learning. Comprehensive experiments on three public datasets demonstrate that the proposed approach achieves higher retrieval accuracy than state-of-the-art methods.
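For a concrete picture of the kind of fusion attention module described above, the following is a minimal sketch assuming a PyTorch-style implementation in which channel attention is applied to text features and spatial attention to image feature maps before projection into a shared hash-code space. All class names, dimensions, and the tanh relaxation of the hashing layer are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal, hypothetical sketch of a fusion attention module for cross-modal
# hashing: channel attention over text features, spatial attention over image
# feature maps, then projection to relaxed binary codes. Names and sizes are
# illustrative assumptions only.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention over a feature vector."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, D)
        return x * self.mlp(x)

class SpatialAttention(nn.Module):
    """Single-channel spatial attention map over a convolutional feature map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.max(dim=1, keepdim=True).values
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn

class FusionAttention(nn.Module):
    """Channel attention for text, spatial attention for images, then a
    shared-length hash projection with a tanh relaxation of the sign function."""
    def __init__(self, text_dim: int, img_channels: int, hash_bits: int = 64):
        super().__init__()
        self.text_att = ChannelAttention(text_dim)
        self.img_att = SpatialAttention()
        self.text_hash = nn.Linear(text_dim, hash_bits)
        self.img_hash = nn.Linear(img_channels, hash_bits)

    def forward(self, text_feat: torch.Tensor, img_feat: torch.Tensor):
        t = self.text_att(text_feat)                 # (B, text_dim)
        v = self.img_att(img_feat).mean(dim=(2, 3))  # pooled to (B, img_channels)
        return torch.tanh(self.text_hash(t)), torch.tanh(self.img_hash(v))

# Example: 8 samples, 512-d text features, 256-channel 14x14 image feature maps.
text_codes, img_codes = FusionAttention(512, 256)(
    torch.randn(8, 512), torch.randn(8, 256, 14, 14))
```

In such a design, a triplet loss over the relaxed codes would then pull semantically compatible text-image pairs together and push mismatched pairs apart; that training objective is not shown here.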