Interactive Enhanced Network Based on Multihead Self-Attention and Graph Convolution for Classification of Hyperspectral and LiDAR Data

Published: 2024, Last Modified: 07 Nov 2024. IEEE Trans. Geosci. Remote. Sens. 2024. License: CC BY-SA 4.0
Abstract: The fusion of multimodal data plays a crucial role in classification tasks. However, existing research typically mines and analyzes the individual features of each data source separately before considering how to fuse them. In contrast, our approach constructs interactive enhanced fusion features (IEFFs) for initial fusion alongside the extraction of individual features, and then integrates them so that the information from each data source is used more comprehensively. To this end, we propose a novel interactive enhanced network based on multihead self-attention (MSA) and graph convolution. Specifically, we extract individual features from hyperspectral image (HSI) and light detection and ranging (LiDAR) data and then construct IEFFs based on the row and column features of the central pixel. Individual features focus on the local characteristics of a single data source, while IEFFs strengthen the feature expression of the central pixel through matrix operations, integrating the complementary information of multimodal data. Subsequently, we use graph convolutional networks (GCNs) to construct graph structures for four types of features (interactive enhanced HSI features, interactive enhanced LiDAR features, HSI individual features, and LiDAR individual features), modeling the pixels as nodes and capturing spatial relationships. On this basis, we apply an MSA mechanism to mine spectral dependencies, further extracting global spectral features. Finally, we design a multimodal gated fusion module (MGFM) that integrates these four feature types through a weighting mechanism. The weight allocation is adjusted dynamically according to the characteristics of each feature, achieving optimal fusion of multimodal data. Extensive experiments on three popular HSI and LiDAR datasets verify the superior performance of our method. Our code will be available at https://github.com/haofeng0003/MSA-GCN.
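The abstract does not spell out how the MGFM computes its weights, so the following is only a minimal sketch of one common form of gated fusion, not the authors' implementation. It assumes the four feature types have already been projected to a shared dimension `D`, and that the gate is a single linear layer followed by a softmax over the four branches. The names `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def gated_fusion(features, gate_weights, gate_bias):
    """Fuse K feature branches with per-sample, data-dependent weights.

    features:     list of K arrays, each of shape (N, D)
    gate_weights: array of shape (K * D, K)  (assumed linear gate)
    gate_bias:    array of shape (K,)
    Returns the fused features (N, D) and the gate values (N, K).
    """
    stacked = np.stack(features, axis=1)           # (N, K, D)
    n, k, d = stacked.shape
    concat = stacked.reshape(n, k * d)             # branches side by side
    gates = softmax(concat @ gate_weights + gate_bias, axis=-1)  # (N, K)
    fused = (gates[:, :, None] * stacked).sum(axis=1)            # (N, D)
    return fused, gates


# Toy inputs standing in for the four feature types named in the abstract:
# interactive enhanced HSI, interactive enhanced LiDAR, individual HSI,
# and individual LiDAR features. Values are random placeholders.
rng = np.random.default_rng(0)
N, D, K = 5, 8, 4
features = [rng.normal(size=(N, D)) for _ in range(K)]
gate_weights = rng.normal(scale=0.1, size=(K * D, K))
gate_bias = np.zeros(K)

fused, gates = gated_fusion(features, gate_weights, gate_bias)
print(fused.shape, gates.shape)
```

Because the gate is computed from the features themselves, each sample receives its own weighting over the four branches, which is one way to read "adjusted dynamically according to the characteristics of each feature". In a trained model, `gate_weights` and `gate_bias` would be learned jointly with the rest of the network.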