Spatial-Spectral Fusion BiFormer: A Novel Dynamic Routing Approach for Hyperspectral Image Classification
Abstract: Hyperspectral image (HSI) classification assigns each pixel to a specific land-cover type. Although convolutional neural networks (CNNs) and Transformers have performed well in HSI classification, they remain limited in extracting high-resolution spatial information and dense semantic spectral information from HSI data. This paper proposes the spatial-spectral fusion BiFormer (SSFBF), a novel method built on a dynamic routing mechanism. SSFBF combines the Transformer’s ability to model long-range dependencies through dynamic routing attention with the CNN’s local receptive fields, which capture fine-grained semantic features in HSIs. Built on an “encoder-decoder” architecture, SSFBF employs bi-level routing attention (BRA) to capture high-resolution feature representations in the spatial domain and single-level spectral routing attention (SSRA) to extract discriminative semantic information in the spectral domain. Additionally, we develop a CNN-based spatial-spectral position feedforward network (SSposFFN) that replaces the traditional MLP layer following BRA and SSRA, mitigating the local feature loss induced by attention. Furthermore, a novel spatial-spectral dynamic fusion (SSDF) module is designed to integrate features from the spatial and spectral domains. Experiments on four HSI datasets demonstrate that SSFBF shows superior versatility across datasets and outperforms state-of-the-art HSI classification methods.
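The abstract names bi-level routing attention without spelling out its mechanics. As a rough illustration only, the sketch below assumes the commonly described two-step formulation: each region first selects its top-k most relevant regions using averaged region-level queries and keys, and each token then attends only to the tokens of those routed regions. The function name `bi_level_routing_attention`, the helper names, and the list-of-regions input layout are hypothetical. Learned projections, multiple heads, and any SSFBF-specific components (SSRA, SSposFFN, SSDF) are omitted, so this is not the authors' implementation.

```python
import math


def mean_vec(vectors):
    """Average a list of equal-length vectors into one region-level vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bi_level_routing_attention(regions_q, regions_k, regions_v, top_k):
    """Hypothetical sketch of region-to-region routing plus token attention.

    Each argument is a list of regions; each region is a list of token
    vectors. Returns (outputs, routes), where routes[i] lists the region
    indices that region i attends to.
    """
    # Level 1: coarse region-level descriptors (assumed: per-region mean).
    region_q = [mean_vec(r) for r in regions_q]
    region_k = [mean_vec(r) for r in regions_k]

    outputs, routes = [], []
    for i, q_tokens in enumerate(regions_q):
        # Region-to-region affinity, then keep the top_k highest-scoring regions.
        affinity = [dot(region_q[i], rk) for rk in region_k]
        order = sorted(range(len(region_k)), key=lambda j: affinity[j], reverse=True)
        routed = order[:top_k]
        routes.append(routed)

        # Level 2: gather keys/values from routed regions only.
        keys = [k for j in routed for k in regions_k[j]]
        values = [v for j in routed for v in regions_v[j]]
        scale = 1.0 / math.sqrt(len(keys[0]))
        dim_v = len(values[0])

        region_out = []
        for q in q_tokens:
            weights = softmax([dot(q, k) * scale for k in keys])
            region_out.append(
                [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim_v)]
            )
        outputs.append(region_out)
    return outputs, routes
```

In this sketch, setting `top_k` to the number of regions makes every token attend to all tokens, as in ordinary full attention. Smaller values restrict each token to the regions its own region was routed to, which is where the computational saving of routing comes from.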