Abstract: The tf-idf weighting scheme is adopted by state-of-the-art object retrieval systems to reflect the difference in discriminability between visual words. However, we argue that it is suboptimal, because it neither takes quantization error into account nor exploits word correlation. We view the tf-idf weights as an example of a diagonal Mahalanobis-type similarity matrix and generalize it into a sparse one by selectively activating off-diagonal elements. Our goal is to separate the similarity of relevant images from that of irrelevant ones by a safe margin. We satisfy such similarity constraints by learning an optimal similarity metric from labeled data. An effective scheme is developed to collect training data, with an emphasis on cases where the tf-idf weights violate the relative relevance constraints. Experimental results on benchmark datasets indicate that the learnt similarity metric consistently and significantly outperforms the tf-idf weighting scheme.
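To make the matrix view concrete, the following is a minimal sketch, not the authors' implementation. It assumes the similarity takes the bilinear form q^T M d over term-frequency histograms, with the tf-idf case corresponding to a diagonal M whose entries are squared idf weights. The vocabulary size, idf values, off-diagonal weight, and margin are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: every name and number here is hypothetical.

def similarity(q, d, M):
    """Bilinear similarity q^T M d, with M stored sparsely as {(i, j): weight}."""
    return sum(w * q[i] * d[j] for (i, j), w in M.items())

def tfidf_matrix(idf):
    """Diagonal M with entry idf_i ** 2, so q^T M d equals the tf-idf dot product."""
    return {(i, i): w * w for i, w in enumerate(idf)}

def violates_margin(q, pos, neg, M, margin):
    """True if the relevant image does not beat the irrelevant one by `margin`."""
    return similarity(q, pos, M) - similarity(q, neg, M) < margin

idf = [1.0, 2.0, 0.5]   # hypothetical idf weights for a 3-word vocabulary
q = [1, 0, 0]           # query histogram
pos = [0, 1, 0]         # relevant image whose feature fell into a neighbouring word
neg = [0, 0, 1]         # irrelevant image

# Diagonal (tf-idf) case: no shared word, so the relevant image scores zero.
M_diag = tfidf_matrix(idf)
s_diag = similarity(q, pos, M_diag)

# Sparse case: activate one symmetric off-diagonal pair linking words 0 and 1.
# The value 0.8 stands in for a learned weight; it is an assumption.
M_sparse = dict(M_diag)
M_sparse[(0, 1)] = M_sparse[(1, 0)] = 0.8
s_sparse = similarity(q, pos, M_sparse)

print(s_diag, s_sparse)
print(violates_margin(q, pos, neg, M_diag, margin=0.5))
print(violates_margin(q, pos, neg, M_sparse, margin=0.5))
```

In this toy setting the diagonal matrix cannot reward a match between correlated words, so the triplet violates the margin; a single off-diagonal entry is enough to satisfy it. Triplets of this kind, where the tf-idf weights fail, are the cases the abstract says the training data emphasizes.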