Keywords: machine learning, remote sensing
Abstract: Remote sensing imagery from systems such as Sentinel provides full coverage of the Earth's surface at roughly 10-meter resolution. The remote sensing community has shifted to extensive use of deep learning models, motivated by their strong performance on benchmarks such as ISPRS Vaihingen. Convolutional models such as UNet and ResNet variants are commonly employed for remote sensing, but because they were developed for RGB imagery they typically accept only three channels, whereas Sentinel satellite systems provide more than ten spectral bands. Recently, a number of transformer architectures have also been proposed for remote sensing, but they have generally not been extensively benchmarked and have been applied only to rather small datasets. Meanwhile, it is becoming possible to obtain dense spatial land-use labels for entire first-level administrative divisions of some countries. Scaling-law observations suggest that substantially larger, multi-spectral transformer models could yield a major leap in the performance of remote sensing models in these settings. In this work, we develop a family of multi-spectral transformer models and evaluate them across orders-of-magnitude differences in parameter count to assess their performance and scaling behavior on a densely labeled imagery dataset. We develop a novel multi-spectral attention strategy and demonstrate its effectiveness through ablations. We further show in this setting that models more than an order of magnitude larger than conventional architectures such as UNet lead to substantial improvements in accuracy: a UNet++ model with 23M parameters achieves less than 65% accuracy, while a multi-spectral transformer with 655M parameters achieves over 95% accuracy on the Biological Valuation Map of Flanders.
A link to open-source code will be provided in the camera-ready version.
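Since the code is not yet released, the following is a minimal sketch of one plausible form of multi-spectral attention consistent with the abstract: each spectral band is patch-embedded separately, and self-attention is applied across the band axis at each spatial location. All class names, the band count (12, Sentinel-2 style), and dimensions here are illustrative assumptions, not the authors' actual architecture.

```python
# Hypothetical sketch only -- NOT the paper's method. It illustrates how a
# transformer could ingest >3 spectral channels by attending across bands.
import torch
import torch.nn as nn

class SpectralPatchEmbed(nn.Module):
    """Embed each spectral band independently into patch tokens."""
    def __init__(self, bands=12, patch=16, dim=256):
        super().__init__()
        # One projection per band; shared weights would be equally plausible.
        self.proj = nn.ModuleList(
            [nn.Conv2d(1, dim, kernel_size=patch, stride=patch) for _ in range(bands)]
        )

    def forward(self, x):  # x: (B, bands, H, W)
        tokens = [p(x[:, i:i + 1]).flatten(2).transpose(1, 2)  # (B, N, dim)
                  for i, p in enumerate(self.proj)]
        return torch.stack(tokens, dim=2)  # (B, N, bands, dim)

class SpectralAttention(nn.Module):
    """Self-attention over the spectral-band axis at each spatial patch."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, t):  # t: (B, N, bands, dim)
        B, N, C, D = t.shape
        t = t.reshape(B * N, C, D)  # treat the bands as the attention sequence
        out, _ = self.attn(t, t, t)
        return out.reshape(B, N, C, D)

# Usage: a 12-band input patch at 64x64 pixels.
x = torch.randn(2, 12, 64, 64)
tokens = SpectralPatchEmbed()(x)
mixed = SpectralAttention()(tokens)
print(mixed.shape)  # torch.Size([2, 16, 12, 256])
```

In this sketch, spectral mixing is decoupled from spatial mixing, so standard spatial transformer blocks could be interleaved afterward; whether the paper fuses the two axes differently is not stated in the abstract.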
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10407