Correcting Geospatial Data Displacement with Foundation Vision Models

Published: 01 Mar 2026, Last Modified: 05 Apr 2026, Venue: ML4RS @ ICLR 2026 (Main), Readers: Everyone, License: CC BY 4.0
TL;DR: We use foundation vision models to correct GPS-displaced annotations by matching them to visually similar reference examples, improving dataset quality without requiring any retraining.
Abstract: Geospatial point annotations collected during field surveys often suffer from positional displacement due to GPS inaccuracy and environmental constraints, limiting their utility for downstream applications. Traditional alignment methods rely on multi-temporal imagery or task-specific training, restricting their practical applicability. We propose a simple preprocessing pipeline that leverages foundation vision models to correct displaced annotations through semantic similarity matching before downstream analysis or model training. A semantic reference is constructed from a small set of annotated examples of the target class. For each displaced point, we define a search region and select the location within it that has the highest similarity to the reference set, computed from feature-extractor embeddings. We evaluate our method on a forestry dataset from the Amazon rainforest containing annotations for over 50 tree species. Linear probing experiments demonstrate that models trained on corrected annotations outperform those trained on the original displaced data, and qualitative analysis shows that corrections consistently move points from background regions toward the target class. By requiring only a small set of reference examples and no retraining, our method provides a practical preprocessing step for improving geospatial annotation quality in field-based surveys.
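The similarity-matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a dense per-location embedding map has already been computed by some feature extractor, that the reference set is summarized by the mean of its L2-normalized embeddings, that similarity is cosine similarity, and that the search region is a square window. The names `correct_point`, `feature_map`, and `radius` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps guards against zero vectors)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def correct_point(point_rc, feature_map, reference_embeddings, radius):
    """Move a displaced point to the most reference-like location nearby.

    point_rc: (row, col) of the displaced annotation on the embedding grid.
    feature_map: array of shape (H, W, D), one embedding per grid location.
    reference_embeddings: array of shape (N, D) from annotated target examples.
    radius: half-width of the square search window (an assumed window shape).
    Returns the corrected (row, col) and its cosine similarity to the prototype.
    """
    h, w, _ = feature_map.shape
    r, c = point_rc
    # Clip the search window to the bounds of the embedding grid.
    r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
    c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
    window = l2_normalize(feature_map[r0:r1, c0:c1])
    # Assumption: the reference set is aggregated into a single mean prototype.
    prototype = l2_normalize(l2_normalize(reference_embeddings).mean(axis=0))
    similarity = window @ prototype  # cosine similarity per window location
    dr, dc = np.unravel_index(np.argmax(similarity), similarity.shape)
    return (r0 + int(dr), c0 + int(dc)), float(similarity[dr, dc])


# Synthetic example: background everywhere, one target-like cell at (6, 7).
feature_map = np.zeros((10, 10, 4))
feature_map[..., 0] = 1.0
feature_map[6, 7] = [0.0, 1.0, 0.0, 0.0]
references = np.array([[0.0, 0.9, 0.1, 0.0], [0.0, 1.0, 0.0, 0.1]])

corrected, score = correct_point((4, 5), feature_map, references, radius=3)
print(corrected, round(score, 3))
```

Other aggregation choices, such as taking the maximum similarity over individual reference embeddings instead of a mean prototype, would fit the same structure; the abstract does not specify which one is used.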
Submission Number: 57