CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization

Sixing Hu, Mengdan Feng, Rang Nguyen, Gim Hee Lee

18 Nov 2022, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: The problem of localization on a geo-referenced aerial/satellite map given a query ground-view image remains challenging due to the drastic change in viewpoint, which causes traditional image-descriptor-based matching to fail. We leverage the recent success of deep learning to propose CVM-Net for the cross-view image-based ground-to-aerial geo-localization task. Specifically, our network is based on the Siamese architecture and performs metric learning for the matching task. We first use fully convolutional layers to extract local image features, which are then encoded into global image descriptors using the powerful NetVLAD. As part of the training procedure, we also introduce a simple yet effective weighted soft-margin ranking loss function that not only speeds up training convergence but also improves the final matching accuracy. Experimental results show that our proposed network significantly outperforms the state-of-the-art approaches on two existing benchmarking datasets. Our code and models are publicly available on the project website.
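The abstract names a weighted soft-margin ranking loss without giving its form. As a minimal sketch only, assume it is a softplus applied to the scaled difference between the matched-pair distance and the unmatched-pair distance, ln(1 + exp(alpha * (d_pos - d_neg))). The function names, the value of `alpha`, and the two-dimensional descriptors below are illustrative assumptions, not taken from the released code.

```python
import math

def weighted_soft_margin_loss(d_pos, d_neg, alpha=10.0):
    """Assumed form: ln(1 + exp(alpha * (d_pos - d_neg))).

    d_pos: distance between a ground descriptor and its matching aerial descriptor.
    d_neg: distance between the same ground descriptor and a non-matching one.
    alpha: weighting factor (illustrative value, an assumption).
    """
    x = alpha * (d_pos - d_neg)
    # Numerically stable softplus: max(x, 0) + log(1 + exp(-|x|)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

# Hypothetical unit-length global descriptors (stand-ins for NetVLAD outputs).
ground = [0.6, 0.8]
aerial_match = [0.8, 0.6]
aerial_other = [-0.6, 0.8]

d_pos = squared_l2(ground, aerial_match)
d_neg = squared_l2(ground, aerial_other)

# Small when the matching pair is closer than the non-matching pair.
loss_good = weighted_soft_margin_loss(d_pos, d_neg)
# Large when the ordering is violated (distances swapped).
loss_bad = weighted_soft_margin_loss(d_neg, d_pos)
print(loss_good, loss_bad)
```

Under this assumed form, a larger `alpha` steepens the penalty around d_pos = d_neg, which is one plausible way such a weight could affect how quickly training converges.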
