Towards LLM Agents for Earth Observation

Chia Hsiang Kao; Wenting Zhao; Shreelekha Revankar; Samuel Speas; Snehal Bhagat; Rajeev Datta; Cheng Perng Phoo; Utkarsh Mall; Carl Vondrick; Kavita Bala; Bharath Hariharan

Published: 10 Jun 2025, Last Modified: 06 Jan 2026. Venue: TerraBytes 2025 (without proceedings). License: CC BY 4.0

Keywords: Earth Observation, Benchmark

TL;DR: We introduce UnivEARTH, a benchmark of 140 yes/no questions spanning 13 Earth science topics and data from 17 remote sensing instruments.

Abstract: Earth Observation (EO) provides critical planetary data for environmental monitoring, disaster management, climate science, and other scientific domains. Here we ask: *Are AI agents ready for reliable Earth Observation?* We introduce **UnivEARTH**, a benchmark of 140 yes/no questions drawn from NASA Earth Observatory articles across 13 topics and 17 satellite sensors. Using the Google Earth Engine API as a tool, LLM agents achieve an accuracy of only 33% because their code fails to run more than 58% of the time. Taken together, our findings identify significant challenges to be solved before AI agents can automate Earth observation, and they suggest paths forward.

Submission Number: 2
