CenterFormer: A Center Spatial-Spectral Attention Transformer Network for Hyperspectral Image Classification

Published: 01 Jan 2025, Last Modified: 15 Apr 2025 · IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025 · CC BY-SA 4.0
Abstract: Feature extraction is crucial for hyperspectral image classification (HSIC), and transformer-based methods have demonstrated significant potential in this field due to their exceptional global modeling capabilities. However, existing transformer-based methods use patches of fixed size and shape as input; while this leverages information from neighboring similar pixels to some extent, it may also introduce heterogeneous pixels from nonhomogeneous regions, reducing classification accuracy. In addition, since the goal of HSIC is to classify the center pixel, the attention computation in these methods may focus on pixels unrelated to the center pixel, further degrading classification accuracy. To address these issues, a novel transformer framework called CenterFormer is proposed, which enhances the center pixel to fully exploit the rich spatial and spectral information. Specifically, a multigranularity feature extractor is designed to effectively capture the fine-grained and coarse-grained spatial–spectral features of hyperspectral images, mitigating the performance degradation caused by heterogeneous pixels. Moreover, a transformer encoder with center spatial–spectral attention is introduced, which enhances the center pixel and models global spatial–spectral information to improve classification performance. Finally, an adaptive classifier balances the classification results from the different granularity branches, further enhancing the performance of CenterFormer. Comparative experiments conducted on four challenging datasets validate the model's effectiveness. Experimental results show that our model achieves an improvement in overall accuracy of up to 2.83% over current state-of-the-art methods.
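The abstract does not give implementation details, but the core idea of center spatial–spectral attention — making the center pixel the reference point of the attention computation so that unrelated patch pixels receive low weight — can be sketched as single-query attention where only the center token issues the query and all patch tokens serve as keys and values. The function name, projection matrices, and shapes below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np


def center_attention(tokens: np.ndarray, center_idx: int,
                     Wq: np.ndarray, Wk: np.ndarray, Wv: np.ndarray) -> np.ndarray:
    """Single-query attention centered on one patch token.

    tokens: (N, d) embeddings of the N pixels in a patch.
    center_idx: index of the center pixel within the patch.
    Returns a (d_v,) aggregated feature for the center pixel.
    """
    q = tokens[center_idx] @ Wq              # query from the center token only
    K = tokens @ Wk                          # (N, d_k) keys from all patch tokens
    V = tokens @ Wv                          # (N, d_v) values from all patch tokens
    scores = K @ q / np.sqrt(K.shape[1])     # (N,) similarity to the center query
    w = np.exp(scores - scores.max())
    w /= w.sum()                             # softmax attention weights
    return w @ V                             # center feature as a weighted sum


# Tiny usage example: a 3x3 patch flattened to 9 tokens with 4-dim embeddings;
# identity projections keep the sketch minimal.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((9, 4))
out = center_attention(tokens, center_idx=4,
                       Wq=np.eye(4), Wk=np.eye(4), Wv=np.eye(4))
```

Because only one query is used, heterogeneous pixels that are dissimilar to the center token receive small softmax weights, which matches the motivation stated in the abstract; the paper's actual mechanism also models spectral attention and multigranularity branches, which this sketch omits.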