Highlights
• The novel reconstruction regularization term preserves the essential information of the original multi-modal data.
• The low-rank constraint effectively exploits the correlation among samples.
• An efficient algorithm with a convergence guarantee is presented to optimize the problem.
• RRLSL can be applied in both supervised and unsupervised settings.
• Superior performance is achieved with the lowest computational complexity.

Abstract
With the rapid increase of multi-modal data on the internet, cross-modal matching or retrieval has recently received much attention. It aims to use one type of data as a query and retrieve results from a database of another type. The most popular approach to this task is latent subspace learning, which learns a shared subspace for multi-modal data so that cross-modal similarity can be measured efficiently. Instead of adopting traditional regularization terms, we require the latent representation to recover the multi-modal information, which serves as a reconstruction regularization term. In addition, we assume that the different view features of samples from the same category share the same representation in the latent space. Since the number of classes is generally smaller than both the number of samples and the feature dimension, the latent feature matrix of the training instances should be low-rank. We therefore learn the optimal latent representation with a reconstruction-based term that recovers the original multi-modal data and a low-rank term that regularizes the subspace learning. Our method handles both supervised and unsupervised cross-modal retrieval tasks, and it also works well when semantic labels are not easy to obtain. We propose an efficient algorithm to optimize the framework. To evaluate the performance of our method, we conduct extensive experiments on various datasets. The experimental results show that our method is very efficient and outperforms state-of-the-art subspace learning approaches.
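As a rough illustration of how a reconstruction regularizer and a low-rank penalty on the shared latent matrix might be combined, the sketch below evaluates one such objective. The variable names (X_views, U_views, Z, lam) and the use of the nuclear norm as the low-rank term are assumptions for illustration only, not the paper's exact RRLSL formulation.

import numpy as np

def objective(X_views, U_views, Z, lam):
    """Hypothetical sketch of a reconstruction + low-rank subspace objective.

    X_views: list of (d_v x n) feature matrices, one per modality.
    U_views: list of (d_v x k) projections mapping the latent space back
             to each modality (reconstruction direction is an assumption).
    Z:       (k x n) shared latent representation of the n training samples.
    lam:     trade-off weight for the low-rank regularizer.
    """
    # Reconstruction term: each modality should be recoverable from Z.
    recon = sum(np.linalg.norm(X - U @ Z, 'fro') ** 2
                for X, U in zip(X_views, U_views))
    # Low-rank term: the nuclear norm of Z encourages samples of the same
    # class to share a common latent representation.
    low_rank = np.linalg.norm(Z, 'nuc')
    return recon + lam * low_rank

In practice such an objective would be minimized by alternating between updating the projections and the latent matrix, with the nuclear norm handled by a singular-value thresholding step; the details here are only a plausible reading of the abstract.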