S2GFormer: A Transformer and Graph Convolution Combining Framework for Hyperspectral Image Classification

Published: 29 Oct 2024 · Last Modified: 13 Nov 2024 · OpenReview Archive Direct Upload · Everyone · CC BY 4.0
Abstract: Transformer-based methods excel at modeling non-local interactions among spectral and spatial information, but local features are easily overlooked. Graph convolutional networks (GCNs), by contrast, exploit neighborhood vertex interactions well through their unique aggregation mechanism, yet their ability to extract global information is limited. In this paper, we comprehensively exploit the advantages of the transformer and graph convolution by combining the two structures into a unified graph Transformer (Graphormer) that constructs both local and global interactions for hyperspectral image (HSI) classification, and we propose a spatial-spectral feature enhanced Graphormer framework (S2GFormer). Specifically, a Follow Patch mechanism is first proposed to transform pixels in the HSI into patches while preserving local spatial features and reducing computational cost. Moreover, a patch-wise spectral embedding block is designed to extract the spectral features of each patch, into which a neighborhood convolution is inserted for comprehensive spectral information extraction. Finally, a multi-layer Graphormer encoder module is proposed to extract representative spatial-spectral features from the patches for HSI classification. The three components are jointly integrated into a unified network, and each benefits the others. Experimental results demonstrate the suitability of S2GFormer for HSI classification compared with other state-of-the-art classifiers, particularly in scenarios with very limited labeled samples. The code of S2GFormer will be made publicly available at https://github.com/DY-HYX.
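
To make the three-stage pipeline described in the abstract concrete, the following is a minimal, self-contained PyTorch sketch: a pixel-to-patch step standing in for the Follow Patch mechanism, a patch-wise spectral embedding with a 1-D convolution over the band axis, and a multi-layer Transformer encoder standing in for the Graphormer encoder. All module names, shapes, and hyperparameters here are assumptions for illustration only and do not reproduce the authors' implementation; refer to the official repository at https://github.com/DY-HYX for the released code.

```python
# Illustrative sketch only; not the authors' released implementation.
import torch
import torch.nn as nn


class FollowPatch(nn.Module):
    """Assumed stand-in for the Follow Patch mechanism: group each pixel with
    its s x s spatial neighbourhood so local spatial context is preserved."""
    def __init__(self, patch_size: int = 7):
        super().__init__()
        self.patch_size = patch_size
        self.pad = patch_size // 2

    def forward(self, x):                       # x: (B, C, H, W) hyperspectral cube
        x = nn.functional.pad(x, [self.pad] * 4, mode="reflect")
        patches = x.unfold(2, self.patch_size, 1).unfold(3, self.patch_size, 1)
        B, C, H, W, s, _ = patches.shape        # (B, C, H, W, s, s)
        # One spatial patch per pixel: (B, H*W, C, s, s)
        return patches.permute(0, 2, 3, 1, 4, 5).reshape(B, H * W, C, s, s)


class PatchSpectralEmbedding(nn.Module):
    """Assumed patch-wise spectral embedding: a 1-D ('neighbourhood')
    convolution over the band axis, followed by a linear projection."""
    def __init__(self, bands: int, dim: int = 64, kernel: int = 3):
        super().__init__()
        self.spec_conv = nn.Conv1d(1, 8, kernel, padding=kernel // 2)
        self.proj = nn.Linear(8 * bands, dim)

    def forward(self, patches):                 # patches: (B, N, C, s, s)
        B, N, C, s, _ = patches.shape
        spec = patches.mean(dim=(-1, -2)).reshape(B * N, 1, C)   # per-patch spectrum
        feat = self.spec_conv(spec).reshape(B * N, -1)
        return self.proj(feat).reshape(B, N, -1)                 # (B, N, dim)


class S2GFormerSketch(nn.Module):
    """End-to-end sketch: Follow Patch -> spectral embedding -> multi-layer
    Transformer encoder (a plain encoder replaces the Graphormer encoder here)."""
    def __init__(self, bands: int, num_classes: int, dim: int = 64, layers: int = 2):
        super().__init__()
        self.follow_patch = FollowPatch()
        self.embed = PatchSpectralEmbedding(bands, dim)
        enc_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                       # x: (B, C, H, W)
        tokens = self.embed(self.follow_patch(x))
        return self.head(self.encoder(tokens))  # per-pixel logits: (B, H*W, classes)


# Tiny smoke test on a random 16-band cube.
logits = S2GFormerSketch(bands=16, num_classes=9)(torch.randn(1, 16, 12, 12))
print(logits.shape)  # torch.Size([1, 144, 9])
```

A real Graphormer encoder would additionally inject graph-structural biases (e.g., centrality and spatial encodings) into the attention scores; the plain encoder above is used only to keep the sketch short and dependency-free.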