GSLLM: Geospatial Knowledge Acquisition for Large Language Models

ICLR 2026 Conference Submission 16137 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: LLM, Geospatial encoding
TL;DR: This study proposes a novel framework to enhance LLMs with geospatial knowledge through specialized tokens, enabling improved reasoning in spatially aware applications.
Abstract: Geospatial information and its associated inferences play a critical role in numerous real-world applications. Although large language models (LLMs) acquire extensive general knowledge through large-scale pretraining, they typically lack explicit representations of geospatial data. In this study, we propose a novel framework for enabling LLMs to acquire and utilize geospatial knowledge. By introducing a set of specialized tokens designed to represent geospatial entities—such as coordinates, locations, and addresses—we effectively embed geospatial information into the model's token space. Building upon this enhanced representation, we conduct supervised fine-tuning (SFT) and reinforcement learning (RL) on a pretrained geospatially augmented model to evaluate its performance across multiple downstream tasks. Our approach demonstrates a systematic method for integrating structured geospatial knowledge into LLMs, thereby extending their reasoning capabilities to spatially informed domains.
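To make the token-augmentation idea concrete, below is a minimal sketch (not the authors' code) of one common way to inject specialized geospatial tokens into an LLM's vocabulary before continued pretraining or SFT. It assumes a Hugging Face-style tokenizer and model; the base model name and the marker tokens (e.g. <lat>, <addr>) are illustrative assumptions, not the paper's actual vocabulary.

```python
# Sketch: add geospatial special tokens to a tokenizer and grow the model's
# embedding table so the new tokens receive trainable vectors.
# Assumes a Hugging Face-style setup; token names and base model are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder backbone; the paper's base model is not specified here

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Hypothetical markers for geospatial entities: coordinates, locations, addresses.
geo_tokens = ["<lat>", "</lat>", "<lon>", "</lon>",
              "<loc>", "</loc>", "<addr>", "</addr>"]
tokenizer.add_special_tokens({"additional_special_tokens": geo_tokens})

# Resize the embedding matrix to cover the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))

# Example: wrap raw geospatial strings with the new markers before tokenization,
# so downstream SFT/RL data can expose structured geospatial spans to the model.
text = "The Eiffel Tower is at <lat>48.8584</lat> <lon>2.2945</lon>."
print(tokenizer(text)["input_ids"])
```

After this step, the new embeddings would typically be trained jointly with the rest of the model during the geospatial pretraining and fine-tuning stages described in the abstract.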
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 16137