Abstract: Cross-modal retrieval is a classic research topic in multimedia information retrieval. The traditional approaches study the problem as a pairwise similarity function problem. In this paper, we consider this problem from a new perspective as a listwise ranking problem and propose a general cross-modal ranking algorithm to optimize the listwise ranking loss with a low rank embedding, which we call Latent Semantic Cross-Modal Ranking (LSCMR). The latent low-rank embedding space is discriminatively learned by structural large margin learning to optimize for certain ranking criteria directly. We evaluate LSCMR on the Wikipedia and NUS-WIDE dataset. Experimental results show that this method obtains significant improvements over the state-of-the-art methods.
0 Replies
Loading