BEV-Patch-PF: Particle Filtering with BEV-Aerial Feature Matching for Off-Road Geo-Localization

Published: 30 May 2025, Last Modified: 09 Jun 2025
RSS 2025 Workshop ROAR Poster
License: CC BY 4.0
Keywords: geo-localization, cross-view, off-road, navigation
Abstract: Accurate localization of ground robots using aerial imagery is essential for off-road navigation and planning, especially in GPS-denied environments. However, this task remains challenging due to large viewpoint differences, scarce distinctive features, and high environmental variability. Most existing approaches localize each frame independently, either by retrieving global descriptors or by aligning ground and aerial features in a shared spatial representation, making them susceptible to ambiguity and multi-modal pose estimates. While sequential localization can reduce such uncertainty, existing per-frame methods incur trade-offs between accuracy, memory, and computational cost, limiting their effectiveness in a sequential setting. We propose BEV-Patch-PF, a GPS-free sequential localization system that integrates a particle filter with a learned bird's-eye-view (BEV) observation model. For each particle pose hypothesis, a single aerial feature patch is cropped and its likelihood is computed by comparing it against the BEV feature derived from the on-board view. Ground features are extracted using a visual foundation model and fused with aerial features via cross-attention to emphasize salient off-road regions. Experiments on real-world off-road routes from the TartanDrive 2.0 dataset demonstrate that BEV-Patch-PF outperforms stereo visual odometry and a retrieval-based baseline in trajectory accuracy across both seen and unseen environments, highlighting its robustness and generalization.
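The core idea of the observation model described above (cropping one aerial feature patch per particle hypothesis and scoring it against the on-board BEV feature) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the feature maps, patch size, cosine-similarity scoring, and the temperature used to turn similarities into likelihoods are all assumptions for the example.

```python
import numpy as np

def crop_patch(aerial_feat, x, y, size):
    """Crop a size x size feature patch centered at integer pose (x, y)."""
    h = size // 2
    return aerial_feat[y - h:y + h, x - h:x + h]

def pf_update(particles, weights, aerial_feat, bev_feat, size=8, temp=0.1):
    """Reweight particles: score the aerial patch at each hypothesized pose
    against the BEV feature derived from the on-board view."""
    sims = []
    for x, y in particles:
        patch = crop_patch(aerial_feat, x, y, size)
        a, b = patch.ravel(), bev_feat.ravel()
        # Cosine similarity between the candidate aerial patch and the BEV feature.
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    # Convert similarities to (unnormalized) likelihoods with a softmax-style temperature.
    lik = np.exp(np.array(sims) / temp)
    w = weights * lik
    return w / w.sum()
```

In a full filter this update step would alternate with a motion model (e.g. odometry-driven propagation) and resampling; here only the BEV-vs-aerial-patch likelihood step is shown.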
Submission Number: 9