Abstract: To estimate location, a common strategy in Visual Place Recognition (VPR) is to first use global retrieval to obtain the top-k candidates and then perform local feature matching among those candidates for reranking. Although local reranking methods bring performance gains, they incur substantial computational overhead. To narrow the performance gap between global retrieval and local reranking at little cost, one option is to rerank candidates using only global features. However, previous works exploited only the information from positive samples among the candidates, ignoring the fact that negative samples can also provide useful information. To this end, we propose RankTuning, a method that aggregates all the information from the candidates using global features for reranking. Specifically, we design a cross-image interaction module that lets every candidate interact with the others to enhance the discriminative power of its features. Furthermore, to drive the training of this module, we propose a Generalized Recall loss that handles hard samples with a better gradient strategy. Experimental results demonstrate that our method can be easily inserted into existing architectures and achieves state-of-the-art performance. Moreover, it requires no additional storage, and its matching latency is only 6.3% of that of the current fastest local reranking method. The code is released at https://github.com/LKELN/RankTuning.git
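The retrieve-then-rerank pipeline described above can be sketched in a few lines. This is a hypothetical, parameter-free illustration only (the paper's actual cross-image interaction module is a learned network, and `rerank_with_global_features` is an invented name): all candidate descriptors attend to one another and to the query, so negative candidates also influence the refined representations before the final similarity ranking.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def rerank_with_global_features(query, candidates):
    """Rerank top-k candidates with one self-attention pass over global features.

    Sketch of "cross-image interaction": the query and all k candidates
    are stacked and attend to each other, so every candidate's refined
    descriptor reflects the full candidate set (positives and negatives).
    A real module would use learned Q/K/V projections; this sketch omits them.
    """
    x = np.vstack([query[None, :], candidates])   # (k+1, d): query first
    d = x.shape[1]
    attn = softmax(x @ x.T / np.sqrt(d), axis=-1)  # pairwise attention weights
    refined = attn @ x                             # context-mixed descriptors
    q, cands = refined[0], refined[1:]
    # Rank candidates by cosine similarity to the refined query.
    q = q / np.linalg.norm(q)
    cands = cands / np.linalg.norm(cands, axis=1, keepdims=True)
    scores = cands @ q
    return np.argsort(-scores)                     # new candidate ordering

# Usage: rerank 5 random candidates for a random query.
rng = np.random.default_rng(0)
order = rerank_with_global_features(rng.normal(size=64), rng.normal(size=(5, 64)))
```

The returned array is a permutation of the original candidate indices; in the full pipeline it would replace the order produced by plain global retrieval.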
External IDs: dblp:journals/tits/LiuZFSXZ25