Abstract: Cross-modal retrieval has emerged as an active research topic in recent years, because modern applications must search efficiently across multimedia documents of different modalities. In this study, we propose a cross-modal hashing method based on a cluster-based joint matrix factorization strategy. Our method first builds clusters for each modality separately and then generates a cross-modal cluster representation for each document. We formulate a joint matrix factorization process under a constraint that pushes the documents' representations in the different modalities, together with the cross-modal cluster representations, into a common consensus matrix. In doing so, we capture the inter-modality, intra-modality and cluster-based similarities in a unified latent space. Finally, we present an efficient way to generate the hash codes using the maximum entropy principle and to compute the binary codes for external queries. In our experiments with two publicly available data sets, we show that the proposed method outperforms state-of-the-art hashing methods on different cross-modal retrieval tasks.
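
The final hashing step can be illustrated with a minimal sketch. It assumes that the maximum entropy principle is realized by thresholding each latent dimension at its median, so that every bit is split evenly between 0 and 1; the abstract does not state the exact rule. The function names `binarize_max_entropy` and `hamming_rank`, and the random matrix standing in for the consensus matrix, are hypothetical and not taken from the paper.

```python
import numpy as np


def binarize_max_entropy(latent):
    """Turn real-valued latent vectors into binary codes.

    Assumption: each column is thresholded at its median, which gives
    balanced bits (and hence maximal per-bit entropy) on the training set.
    """
    latent = np.asarray(latent, dtype=float)
    thresholds = np.median(latent, axis=0)          # one threshold per bit
    codes = (latent > thresholds).astype(np.uint8)  # rows: documents, columns: bits
    return codes, thresholds


def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to a query code."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances


# Hypothetical stand-in for the learned consensus matrix: 8 documents, 4 latent dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(8, 4))

codes, thresholds = binarize_max_entropy(latent)
order, distances = hamming_rank(codes[0], codes)
```

An external query would be projected into the same latent space first and then binarized with the stored `thresholds`, so that its code is comparable with the database codes.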