Abstract: Remote sensing scene classification aims to automatically assign a specific semantic label to each aerial image. In this letter, we propose a new method, called self-attention-based deep feature fusion (SAFF), which aggregates deep-layer features and emphasizes the weights of complex objects in remote sensing scene images. First, a pretrained convolutional neural network (CNN) model is applied to extract abstract multilayer feature maps from the original aerial imagery. Then, a nonparametric self-attention layer is proposed for spatial-wise and channel-wise weighting, which enhances the spatial responses of representative objects and makes fuller use of infrequently occurring features, so that more discriminative features can be extracted. Finally, the aggregated features are fed into a support vector machine (SVM) for classification. The proposed method is evaluated on several data sets, and the results demonstrate the effectiveness and efficiency of the scheme for remote sensing scene classification.
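The weighting-and-aggregation step described above can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the function names (`spatial_weights`, `channel_weights`, `saff_like_descriptor`) are hypothetical, the spatial weight is taken to be the L2-normalized channel-summed response at each location, and the channel weight is taken to be a log-inverse of each channel's fraction of non-zero activations, so that rarely firing channels count more. The feature maps are assumed to come from a pretrained CNN as arrays of shape `(C, H, W)`.

```python
import numpy as np

EPS = 1e-8  # small constant to avoid division by zero


def spatial_weights(fmap):
    """Assumed spatial weighting: (C, H, W) activations -> (H, W) weights.

    Locations with a strong summed response across channels get more weight.
    """
    summed = fmap.sum(axis=0)
    return summed / (np.linalg.norm(summed) + EPS)


def channel_weights(fmap):
    """Assumed channel weighting: (C, H, W) activations -> (C,) weights.

    Channels that are non-zero at few locations receive larger weights.
    """
    c = fmap.shape[0]
    nonzero_frac = (fmap > 0).reshape(c, -1).mean(axis=1)
    return np.log((nonzero_frac.sum() + EPS) / (nonzero_frac + EPS))


def saff_like_descriptor(fmaps):
    """Weight, pool, and concatenate several feature maps into one vector."""
    parts = []
    for fmap in fmaps:
        fmap = np.maximum(fmap, 0.0)                 # keep non-negative responses
        s = spatial_weights(fmap)                    # (H, W)
        pooled = (fmap * s[None, :, :]).sum(axis=(1, 2))  # weighted sum-pooling, (C,)
        parts.append(pooled * channel_weights(fmap))
    vec = np.concatenate(parts)
    return vec / (np.linalg.norm(vec) + EPS)         # L2-normalize the fused vector


# Stand-in for multilayer CNN feature maps (random values, illustration only).
rng = np.random.default_rng(0)
fmaps = [rng.standard_normal((8, 7, 7)), rng.standard_normal((16, 7, 7))]
descriptor = saff_like_descriptor(fmaps)
```

Both weightings here have no learned parameters, which matches the "nonparametric" description. The resulting descriptors could then be passed to an SVM classifier, for example scikit-learn's `sklearn.svm.SVC`.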