SWINT-RESNet: An Improved Remote Sensing Image Segmentation Model Based on Transformer

Published: 2024, Last Modified: 15 Jan 2026. IEEE Geosci. Remote. Sens. Lett. 2024. License: CC BY-SA 4.0
Abstract: Deep neural networks are widely used in remote sensing image segmentation, and artificial intelligence methods are increasingly applied to remote sensing feature classification. Although convolutional neural networks (CNNs) are widely used for image segmentation, their global feature extraction becomes insufficient as the number of image samples increases. Transformers have recently attracted attention in computer vision. However, although a transformer can capture the global information of remote sensing images, it cannot adequately model the detailed information of image changes. To compensate for the shortcomings of both CNNs and transformer networks in feature extraction, this study proposes a semantic segmentation network with multifeature fusion (SWINT-RESNet). The network combines the global and local features extracted by the transformer with those extracted by CNNs to improve the accuracy of remote sensing image segmentation. Experiments show that SWINT-RESNet achieves superior segmentation performance on both small- and medium-sample remote sensing image datasets.
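
The abstract does not specify how the two feature streams are fused. As a rough illustration of what multifeature fusion can look like, the sketch below concatenates a CNN feature map and a transformer feature map channel-wise at each pixel and then mixes the channels with a 1x1 projection. This is an assumption made for illustration, not the fusion scheme used by SWINT-RESNet; the names `concat_channels` and `project_1x1`, the `[row][col][channel]` layout, and the weights are all hypothetical.

```python
from typing import List

# A feature map indexed as [row][col][channel]; a hypothetical layout for this sketch.
FeatureMap = List[List[List[float]]]


def concat_channels(cnn_feat: FeatureMap, trans_feat: FeatureMap) -> FeatureMap:
    """Concatenate two feature maps along the channel axis, pixel by pixel."""
    same_rows = len(cnn_feat) == len(trans_feat)
    same_cols = same_rows and all(
        len(a) == len(b) for a, b in zip(cnn_feat, trans_feat)
    )
    if not same_cols:
        raise ValueError("feature maps must share the same spatial size")
    return [
        [pixel_a + pixel_b for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(cnn_feat, trans_feat)
    ]


def project_1x1(feat: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply a 1x1 projection: each output channel is a weighted sum of the
    input channels at the same pixel. weights is [out_channels][in_channels]."""
    return [
        [
            [sum(w * x for w, x in zip(out_w, pixel)) for out_w in weights]
            for pixel in row
        ]
        for row in feat
    ]


# Toy 1x2 feature maps: one CNN channel and two transformer channels per pixel.
cnn = [[[1.0], [2.0]]]
trans = [[[10.0, 20.0], [30.0, 40.0]]]

fused = concat_channels(cnn, trans)
# Hypothetical weights: keep the CNN channel, average the two transformer channels.
mixed = project_1x1(fused, [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]])
```

In a real network the projection weights would be learned, and the two streams would usually need resampling to a common spatial resolution before fusion; both steps are omitted here.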