Joint-Neighborhood Product Quantization for Unsupervised Cross-Modal Retrieval

Published: 01 Jan 2024, Last Modified: 11 Apr 2025, Venue: VCIP 2024, License: CC BY-SA 4.0
Abstract: Product quantization (PQ) transforms high-dimensional data into compact binary codes, reducing storage cost and improving search efficiency. However, existing PQ methods learn modality-specific features separately from the quantization codewords, which leads to suboptimal performance in cross-modal retrieval tasks. In this paper, we propose a joint-neighborhood product quantization (JNPQ) method that learns modality-specific features and quantization codewords simultaneously. To achieve this, we first introduce a cross-modal quantization contrastive learning module that preserves the inter-modal neighborhood of the original data and reduces the quantization error. Then, we design a self-neighbor contrastive learning module that enhances the intra-modal neighborhood within each modality. Extensive experiments demonstrate that JNPQ achieves state-of-the-art results in cross-modal retrieval when compared with other unsupervised cross-modal quantization methods.
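For readers unfamiliar with the PQ step the abstract builds on, the following is a minimal sketch of generic product quantization in NumPy. It is not the JNPQ method: codewords are fit here with plain k-means on fixed features, which is exactly the separated pipeline the abstract argues against. All function names, sizes, and the random data are assumptions chosen for illustration.

```python
import numpy as np


def train_codebooks(x, num_subspaces, num_codewords, iters=10, seed=0):
    """Fit one k-means codebook per subspace; returns shape (M, K, d_sub)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    d_sub = d // num_subspaces  # assumes d is divisible by num_subspaces
    codebooks = np.empty((num_subspaces, num_codewords, d_sub))
    for m in range(num_subspaces):
        sub = x[:, m * d_sub:(m + 1) * d_sub]
        # Initialise centres from randomly chosen data points.
        centers = sub[rng.choice(n, num_codewords, replace=False)].copy()
        for _ in range(iters):
            dists = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for k in range(num_codewords):
                members = sub[assign == k]
                if len(members):  # keep the old centre if a cluster is empty
                    centers[k] = members.mean(0)
        codebooks[m] = centers
    return codebooks


def encode(x, codebooks):
    """Map each sub-vector to the index of its nearest codeword."""
    num_subspaces, _, d_sub = codebooks.shape
    codes = np.empty((x.shape[0], num_subspaces), dtype=np.int64)
    for m in range(num_subspaces):
        sub = x[:, m * d_sub:(m + 1) * d_sub]
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(-1)
        codes[:, m] = dists.argmin(1)
    return codes


def decode(codes, codebooks):
    """Reconstruct vectors by concatenating the selected codewords."""
    parts = [codebooks[m][codes[:, m]] for m in range(codebooks.shape[0])]
    return np.concatenate(parts, axis=1)


# Toy data standing in for learned features (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 16))
codebooks = train_codebooks(x, num_subspaces=4, num_codewords=8)
codes = encode(x, codebooks)    # 4 indices per vector, each in [0, 8)
recon = decode(codes, codebooks)
quantization_error = ((x - recon) ** 2).sum(1).mean()
```

With 8 codewords per subspace, each index needs 3 bits, so a 16-dimensional vector is stored as a 12-bit code in this toy setting. The `quantization_error` value is the quantity that a quantization-aware training objective would aim to keep small.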