TL;DR: We extend the GeoBench benchmark to evaluate cross-band generalization of remote sensing foundation models.
Abstract: The number and diversity of remote sensing satellites grow over time, yet the vast majority of labeled data comes from older satellites. As foundation models for Earth observation scale up, the cost of (re-)training them to support new satellites grows too, making spectral generalization critical. We introduce GeoCrossBench, an extension of GeoBench with a new evaluation protocol that tests three settings: in-distribution performance, generalization to satellites with no band overlap, and generalization to satellites with additional bands. We also develop ChiViT, a self-supervised extension of ChannelViT, to improve cross-satellite performance. We show that even the best remote sensing (RS) foundation models do not outperform general-purpose models such as DINOv3 on our benchmark, whereas ChiViT outperforms DINOv3, the runner-up. Finally, we show that the performance of all tested models drops by 5-25% when they are given additional bands at test time, highlighting that current architectures are not yet future-proof.
Submission Number: 64