Map-RAG: Enhancing LLM-Based Reasoning for Geo-Localization via Map-Grounded Retrieval and Self-Consistency

Agents4Science 2025 Conference Submission 69 Authors

02 Sept 2025 (modified: 08 Oct 2025). Submitted to Agents4Science. License: CC BY 4.0

Keywords: Large Language Models

TL;DR: Map-RAG is a retrieval-augmented LLM framework that grounds multi-chain reasoning in map data to improve the accuracy and interpretability of visual geo-localization in GNSS-denied environments.

Abstract: Large Language Models (LLMs) have recently demonstrated promising reasoning abilities in multimodal tasks, yet their performance in fine-grained geo-localization remains limited due to hallucinations, insufficient spatial priors, and a lack of structured evidence integration. This paper introduces Map-RAG, a reasoning-augmented framework for visual geo-localization in which an LLM iteratively retrieves structured map knowledge and refines its hypotheses through a self-consistency mechanism. Unlike prior approaches that rely solely on embedding similarity or chain-of-thought prompting, Map-RAG integrates three key modules: (1) a visual-to-text translator that extracts geographic cues (e.g., road topology, building style, language on signs) from input images; (2) a map-grounded retrieval agent that queries OpenStreetMap and local gazetteers for candidate regions; and (3) a multi-chain self-consistency verifier that scores and reconciles multiple reasoning trajectories based on semantic-map alignment and geometric feasibility. Experiments on CVUSA, VIGOR, and MSLS benchmarks demonstrate that Map-RAG achieves significant improvements over baselines in Recall@1 (+6–12%) and median localization error (−20–35%), while producing interpretable reasoning traces. Ablation studies confirm that map-grounded retrieval reduces hallucination, and that multi-chain self-consistency enhances robustness under challenging conditions such as seasonal changes and partial occlusions. This work provides evidence that LLMs, when equipped with structured geographic knowledge and verification mechanisms, can serve as explainable geo-localizers in GNSS-denied environments. Beyond performance gains, Map-RAG contributes an auditable reasoning pipeline, aligning with the broader goal of transparent and reproducible AI for scientific discovery.
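To make the third module more concrete, the following is a minimal sketch of how multiple reasoning chains could be scored and reconciled. It is not the authors' implementation: the `Hypothesis` fields, the equal weights, the 50 m agreement radius, and the function names `chain_score` and `reconcile` are all assumptions chosen for illustration. Each chain is assumed to emit one location hypothesis along with a semantic-map alignment score and a geometric feasibility score, both in [0, 1].

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One reasoning chain's location guess (hypothetical structure)."""
    lat: float
    lon: float
    semantic_alignment: float     # assumed in [0, 1]
    geometric_feasibility: float  # assumed in [0, 1]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def chain_score(h, w_sem=0.5, w_geo=0.5):
    """Weighted sum of the two scores; the weights are illustrative assumptions."""
    return w_sem * h.semantic_alignment + w_geo * h.geometric_feasibility


def reconcile(hypotheses, radius_m=50.0):
    """Pick the hypothesis with the most score-weighted support nearby.

    A hypothesis is supported by every chain (itself included) whose guess
    lies within radius_m of it. Ties keep the earliest hypothesis.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    best, best_support = None, -1.0
    for cand in hypotheses:
        support = sum(
            chain_score(other)
            for other in hypotheses
            if haversine_m(cand.lat, cand.lon, other.lat, other.lon) <= radius_m
        )
        if support > best_support:
            best, best_support = cand, support
    return best, best_support


if __name__ == "__main__":
    chains = [
        Hypothesis(40.0, -75.0, 0.6, 0.6),
        Hypothesis(40.0001, -75.0001, 0.6, 0.6),
        Hypothesis(41.0, -74.0, 0.9, 0.9),  # confident but isolated chain
    ]
    winner, support = reconcile(chains)
    print(winner, support)
```

In this sketch, two moderately scored chains that agree on a location outweigh a single higher-scored outlier, which is the basic intuition behind self-consistency voting. A real verifier would presumably derive the two scores from map data rather than take them as given.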

Submission Number: 69
