Survey Protocol Cards for Crop Maps

Published: 01 Mar 2026, Last Modified: 01 Mar 2026
Venue: ML4RS @ ICLR 2026 (Main)
License: CC BY 4.0
Abstract: Crop type maps underpin food security decisions, yet their accuracy depends on training label quality, which in turn depends on survey design choices made under tight budgets. Survey planners must allocate limited resources across GPS devices, sample size, enumerator training, and verification protocols, but they lack quantitative guidance on which investments yield the largest quality gains. We address this gap by modeling the full chain from survey design to downstream model degradation: survey choices map to costs, costs constrain achievable noise levels, and noise levels determine performance loss. We implement $16$ noise functions grounded in documented errors from JECAM, WorldCereal, and LSMS-ISA, and measure degradation on $2$ datasets: EuroCrops and Zambia. Our experiments reveal that label verification matters far more than GPS accuracy: crop misidentification causes up to 99\% F1 loss, while 30 m GPS jitter causes only 4\%. Within-dataset surrogate models achieve R$^2$=0.87, enabling millisecond what-if queries, but cross-dataset transfer shows mixed results: Spearman $\rho$=0.32--0.60 indicates that rankings transfer asymmetrically, and negative R$^2$ reveals that absolute degradation predictions fail across contexts. We package these findings into a programmable protocol card and a web interface that optimizes survey design given budget constraints. Code and the interactive tool will be released upon feedback.
Submission Number: 43
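
To make the two error types contrasted in the abstract concrete, the following is a minimal Python sketch of a GPS jitter function and a crop misidentification function. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian jitter model, the uniform label-flip model, and the relative definition of F1 loss are all hypothetical and are not taken from the paper's $16$ noise functions.

```python
import math
import random

# Hypothetical helpers for illustration only; the paper's actual noise
# functions are not described in the abstract.

# Approximate meters per degree of latitude (assumes a spherical Earth).
METERS_PER_DEGREE_LAT = 111_320.0


def jitter_gps(lat, lon, sigma_m, rng):
    """Displace a point by Gaussian noise with std sigma_m meters per axis."""
    d_north = rng.gauss(0.0, sigma_m)
    d_east = rng.gauss(0.0, sigma_m)
    new_lat = lat + d_north / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    meters_per_degree_lon = METERS_PER_DEGREE_LAT * math.cos(math.radians(lat))
    new_lon = lon + d_east / meters_per_degree_lon
    return new_lat, new_lon


def misidentify_crops(labels, classes, flip_rate, rng):
    """With probability flip_rate, swap each label for a different class."""
    noisy = []
    for label in labels:
        if rng.random() < flip_rate:
            alternatives = [c for c in classes if c != label]
            noisy.append(rng.choice(alternatives))
        else:
            noisy.append(label)
    return noisy


def relative_f1_loss(clean_f1, noisy_f1):
    """Fraction of the clean F1 score lost under noise (assumed definition)."""
    return (clean_f1 - noisy_f1) / clean_f1


if __name__ == "__main__":
    rng = random.Random(0)
    classes = ["maize", "soy", "wheat"]  # placeholder class names
    labels = ["maize", "soy", "wheat", "maize"]
    print(misidentify_crops(labels, classes, flip_rate=0.25, rng=rng))
    # Arbitrary example coordinates; sigma_m=30 mirrors the 30 m jitter level.
    print(jitter_gps(-15.4, 28.3, sigma_m=30.0, rng=rng))
```

In a degradation experiment of the kind the abstract describes, one would apply such a function to the training labels at a chosen noise level, retrain the classifier, and compare its F1 score against the clean baseline.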