Multi-Source Patch Feature Fusion With Neighborhood Flash Attention Transformer for Pixel-Level Vehicle and Road Recognition in Hyperspectral Image
Abstract: Hyperspectral imaging captures the spectrum of each pixel in an image across many wavelengths, providing opportunities for precise detection, classification, and analysis of transportation infrastructure. However, traditional methods often struggle with the curse of dimensionality, inter-class variability, and the spectral-spatial trade-off inherent in hyperspectral data. To address these challenges, we introduce a novel Multi-Source Patch Feature fusion based Neighborhood Flash Attention Transformer (MSPF-NFAT) for pixel-level vehicle and road recognition in hyperspectral images (HSIs). Our method rests on the insight that integrating complementary features from multiple sources and scales can significantly enhance classification performance. Specifically, the MSPF module aggregates and harmonizes features extracted from both the spectral and spatial dimensions, as well as from different contextual scales within the image. This fusion yields a richer representation of the data, capturing both the fine-grained details and the broader contextual information essential for accurate classification. Building on this enriched feature set, we employ the NFAT, a state-of-the-art attention mechanism that captures local spatial relationships while scaling efficiently to the high-resolution characteristics of hyperspectral data. Extensive experiments on four widely used HSI datasets show that the proposed method outperforms other state-of-the-art methods.
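The two ideas named in the abstract, multi-scale patch features and attention restricted to a pixel's neighborhood, can be illustrated with a minimal NumPy sketch. This is not the authors' MSPF-NFAT implementation: the function names, the patch sizes (3, 5, 7), and the absence of learned projections are all assumptions made for illustration, and the memory-efficient Flash Attention kernel is replaced here by plain scaled dot-product attention.

```python
import numpy as np


def extract_patch(cube, row, col, size):
    """Return a size x size spatial patch centered on (row, col), keeping all bands.

    `cube` has shape (height, width, bands); `size` is assumed to be odd.
    """
    half = size // 2
    # Reflect-pad the two spatial axes so border pixels still get full patches.
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")
    return padded[row:row + size, col:col + size, :]


def neighborhood_attention(patch):
    """Attend from the center pixel to every pixel in its patch.

    Single head, no learned weights (an illustrative simplification).
    """
    size, _, bands = patch.shape
    tokens = patch.reshape(-1, bands)            # (size * size, bands)
    query = patch[size // 2, size // 2]          # spectrum of the center pixel
    scores = tokens @ query / np.sqrt(bands)     # scaled dot-product scores
    scores = scores - scores.max()               # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ tokens                      # attention-weighted spectrum, (bands,)


def multi_scale_feature(cube, row, col, sizes=(3, 5, 7)):
    """Concatenate neighborhood-attention features from several patch sizes."""
    features = [neighborhood_attention(extract_patch(cube, row, col, s)) for s in sizes]
    return np.concatenate(features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.random((8, 9, 4))                 # toy cube: 8 x 9 pixels, 4 bands
    print(multi_scale_feature(cube, 0, 0).shape)
```

Each scale contributes one vector of length `bands`, so three scales on a 4-band cube give a 12-dimensional feature for the pixel. A per-pixel classifier would then consume that fused vector.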