3-D Point-Guided Aerial–Ground Image Matching for Robust Multiview Reconstruction

Published: 01 Jan 2025, Last Modified: 27 May 2026, Venue: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, License: CC BY-SA 4.0
Abstract: Matching and aligning ground and aerial images are critical for enhancing the accuracy and completeness of 3-D reconstruction. However, significant differences in perspective and radiometric characteristics between aerial and ground images make this task highly challenging. Existing mesh-based approaches often overlook the geometric properties of 3-D points in the structure-from-motion model and suffer from limited track length. To address these issues, we propose a 3-D point-guided matching framework that leverages reconstructed 3-D points to guide the matching between aerial and ground images. Our method introduces a 3-D point-guided transformer to encode point coordinates into embeddings and integrate them into image features, enabling effective correspondence between synthetic aerial views and real ground images. In addition, we design a transformer-based regression module to refine matching positions within local windows, improving the accuracy of aerial–ground correspondences. Our pipeline reduces matching errors, enables long-track correspondences, and facilitates robust multiview integration. Furthermore, we construct two challenging aerial–ground datasets to validate the effectiveness of our method in city-scale 3-D reconstruction. Extensive experiments on public benchmarks and our datasets demonstrate that our framework significantly outperforms state-of-the-art methods in both matching accuracy and reconstruction quality.