SPGFormer: Structure Perception Graph Transformer With Laplacian Position Encoding for Hyperspectral Image Classification
Abstract: With the integration of graph structure representation and the self-attention mechanism, the graph Transformer (GT) demonstrates remarkable effectiveness in hyperspectral image (HSI) classification by simultaneously modeling complex topological structures and capturing long-range spectral–spatial dependencies. However, current GT models struggle to effectively integrate node features with position information while balancing global positional context and local structural relationships. To address these challenges, this article proposes a structure perception GT (SPGFormer) with Laplacian position encoding (PE) for HSI classification. Specifically, SPGFormer introduces a dual-branch interactive Transformer (DBIT) module that processes node spectral features and Laplacian position information in parallel via the node feature Transformer (NFT) branch and the structure perception position Transformer (SPPT) branch, respectively. The NFT branch employs a Transformer encoder to model spectral characteristics, while the SPPT branch leverages graph Laplacian eigenvectors to capture global position information. In addition, a structure perception multihead self-attention (SPMHSA) mechanism is proposed to integrate local topological structural information into the SPPT branch. Moreover, to facilitate effective interaction between these complementary feature spaces, a bidirectional cross-attention module (BCAM) is proposed to enable bidirectional communication between the two branches. Comprehensive experimental results demonstrate that SPGFormer outperforms existing state-of-the-art (SOTA) methods on multiple metrics, including overall accuracy (OA), average accuracy (AA), and the Kappa coefficient, with OA improvements of 2.04%, 2.39%, 1.10%, and 2.27% on four benchmark HSI datasets, respectively.
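The abstract does not specify the authors' exact implementation, but the Laplacian PE it refers to is conventionally built from the eigenvectors of the normalized graph Laplacian. The following is a minimal NumPy sketch of that standard construction (the function name and the use of the symmetric normalized Laplacian are assumptions, not details from the paper): each node receives the k smallest non-trivial eigenvectors as its position encoding.

```python
import numpy as np

def laplacian_pe(adj: np.ndarray, k: int) -> np.ndarray:
    """Return k non-trivial Laplacian eigenvectors as per-node position encodings.

    A sketch of the standard Laplacian PE construction, not the paper's code.
    """
    deg = adj.sum(axis=1)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    d_inv_sqrt = np.where(deg > 0, np.power(deg, -0.5, where=deg > 0), 0.0)
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh: eigenpairs of a symmetric matrix, sorted by ascending eigenvalue
    eigvals, eigvecs = np.linalg.eigh(lap)
    # Skip the trivial constant eigenvector (eigenvalue ~ 0); keep the next k
    return eigvecs[:, 1:k + 1]

# Toy example: a 4-node path graph, 2-dimensional position encoding per node
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
pe = laplacian_pe(A, k=2)
print(pe.shape)  # (4, 2)
```

In a dual-branch design like the one described, such encodings would feed the position branch while raw spectral vectors feed the feature branch; note that Laplacian eigenvectors are sign-ambiguous, so practical pipelines often randomize signs during training.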
External IDs: doi:10.1109/tgrs.2025.3599140