A Comparative Study on Vision Transformers in Remote Sensing Building Extraction

Georgios-Fotios Angelis, Armando Domi, Alexandros Zamichos, Maria Tsourma, Ioannis Manakos, Anastasios Drosou, Dimitrios Tzovaras

Published: 2023, Last Modified: 04 Mar 2025. Venue: VISIGRAPP (3: IVAPP) 2023. License: CC BY-SA 4.0

Abstract: Data visualization has received great attention in the last few years and offers valuable tools for better understanding and extracting information from data. More specifically, for geospatial data, visualization includes information about the location, the geometric shape, and the exact position of elements, which can enhance downstream applications such as damage detection, building energy consumption estimation, urban planning, and change detection. Extracting building footprints from remote sensing (RS) imagery can help in visualizing damaged buildings and separating them from terrestrial objects. Considering this, the current manuscript provides a detailed comparison and a new benchmark for remote sensing building extraction. Experiments are conducted on three publicly available datasets to evaluate the accuracy and performance of the compared Transformer-based architectures. MiTNet and five other Transformer architectures are introduced, namely DeepViTU