GeoFT: Fine-tuning Foundation Models for Automated OSINT Geolocation

Published: 06 Mar 2025, Last Modified: 06 Mar 2025
Venue: ICLR 2025 FM-Wild Workshop
License: CC BY 4.0
Keywords: Foundation Models, Vision-Language Models, Fine-Tuning, National Security, OSINT
Abstract: Open-source intelligence (OSINT) investigators must verify the location of media shared online. Traditional geolocation relies on manual effort and cannot scale with the growing volume of images and videos posted to social media. We present GeoFT, a fine-tuned version of GeoCLIP optimized for geolocation in Russia and Ukraine. By focusing on street-level imagery and leveraging community-validated datasets, the model improves accuracy over existing solutions: on our test set, GeoFT reduces the average error from 3,520 km to 2,150 km while maintaining interpretable confidence scores. We demonstrate the model's potential for aiding OSINT investigations and discuss pathways for deployment in real-world applications.
Submission Number: 140
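
The abstract reports accuracy as an average error in kilometres. It does not say how that distance is computed, so the following is only a minimal sketch of one common convention: the mean great-circle (haversine) distance between predicted and true coordinates. The function names `haversine_km` and `mean_error_km` and the Earth-radius constant are assumptions made for illustration, not taken from the paper.

```python
import math

# Mean Earth radius in km; an assumption of this sketch, not a value from the paper.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_error_km(predicted, actual):
    """Average haversine error over paired (lat, lon) predictions and ground truth."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and the same length")
    total = sum(haversine_km(p[0], p[1], a[0], a[1])
                for p, a in zip(predicted, actual))
    return total / len(predicted)


def relative_reduction(before_km, after_km):
    """Fractional reduction in error, e.g. 0.25 means a 25% lower error."""
    return (before_km - after_km) / before_km
```

Under this convention, the figures quoted in the abstract (3,520 km down to 2,150 km) correspond to `relative_reduction(3520, 2150)`, a reduction of roughly 39%.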