Abstract: This paper proposes a novel method for cross-modal retrieval, Pairwise-Listwise ranking (PL-ranking), based on a low-rank optimization framework. Motivated by the fact that optimizing the top of the ranking is more applicable in practice, we focus on improving precision at the top of the ranked list for a given sample while learning a low-dimensional common subspace for multi-modal data. Concretely, PL-ranking imposes three constraints. First, we use a pairwise ranking loss constraint to optimize the top of the ranking. Second, because the pairwise ranking loss constraint ignores class information, we further adopt a listwise constraint that minimizes the intra-neighbor variance and maximizes the inter-neighbor separability. In this way, class information is preserved while the number of iterations is reduced. Finally, low-rank regularization is applied to exploit the correlations between features and labels, so that the relevance between the different modalities is enhanced after they are mapped into the common subspace. We design an efficient low-rank stochastic subgradient descent method to solve the proposed optimization problem. Experimental results show that PL-ranking improves the average MAP scores over state-of-the-art methods by 5.1%, 9.2%, 4.7% and 4.8% on the Wiki, Flickr, Pascal and NUS-WIDE datasets, respectively.
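As a rough illustration of how a pairwise ranking loss can be combined with a low-rank constraint, the sketch below scores an image-text pair with a bilinear form and alternates a hinge-loss subgradient step with singular value shrinkage. It is a generic stand-in, not the PL-ranking algorithm: the bilinear matrix `W`, the function names, the margin, and the step sizes are all assumptions made for this example, and the listwise constraint is omitted.

```python
import numpy as np

def pairwise_hinge(W, x, y_pos, y_neg, margin=1.0):
    """Hinge loss asking the relevant item to outscore the irrelevant one by `margin`."""
    s_pos = x @ W @ y_pos  # bilinear relevance score for the relevant item
    s_neg = x @ W @ y_neg  # bilinear relevance score for the irrelevant item
    return max(0.0, margin - s_pos + s_neg)

def hinge_subgradient(W, x, y_pos, y_neg, margin=1.0):
    """Subgradient of the pairwise hinge loss with respect to W."""
    if pairwise_hinge(W, x, y_pos, y_neg, margin) > 0.0:
        return -np.outer(x, y_pos - y_neg)
    return np.zeros_like(W)  # margin already satisfied, nothing to update

def shrink_singular_values(W, tau):
    """Soft-threshold the singular values of W by tau (encourages low rank)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def sgd_step(W, x, y_pos, y_neg, lr=0.1, lam=0.01, margin=1.0):
    """One stochastic step: hinge subgradient update, then singular value shrinkage."""
    W = W - lr * hinge_subgradient(W, x, y_pos, y_neg, margin)
    return shrink_singular_values(W, lr * lam)
```

In this sketch, repeatedly calling `sgd_step` on sampled (query, relevant, irrelevant) triplets lowers the pairwise loss while the shrinkage step keeps the rank of `W` small. Factorizing `W` into two projection matrices would give an explicit common subspace, which is closer in spirit to the abstract's description.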