Abstract: Deep learning (DL) techniques are widely applied in remote sensing (RS) because of their outstanding nonlinear feature extraction ability. However, in large-scale, very high-resolution (VHR) land cover classification, multi-object distributions and clear appearance with large intraclass differences pose challenges for refined pixelwise land cover mapping. To address these problems, this letter proposes a novel encoding-to-decoding method, the full receptive field (RF) network (FRF-Net), based on two types of attention mechanism. FRF-Net uses ResNet-101 as its backbone. First, an ensemble feature is generated by encoding the high-level features with a self-attention mechanism, which achieves a full RF and captures long-range semantics. Next, the encoding result is decoded by a fusion attention mechanism that combines it with the low-level features, producing a fusion feature that contains a refined semantic description for accurate land cover mapping. Extensive experiments on the GID and ISPRS data sets show that the proposed network outperforms state-of-the-art methods. FRF-Net achieved 66.71% and 64.17% mean classwise Intersection over Union (mIOU) on ISPRS and GID, respectively, at a lower computational cost.
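For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIOU is commonly computed from a predicted label map and a ground-truth label map. It is not taken from the letter: the function name `mean_iou`, the toy arrays, and the choice to skip classes that appear in neither map are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean of classwise IoU for two integer label maps of equal shape.

    Hypothetical helper for illustration; the letter's exact evaluation
    protocol (e.g. ignored classes) is not specified here.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes absent from both maps are left out of the mean.
    present = union > 0
    return float((intersection[present] / union[present]).mean())

# Toy 2x3 label maps with three classes.
pred = np.array([[0, 0, 1], [1, 2, 2]])
target = np.array([[0, 1, 1], [1, 2, 0]])
score = mean_iou(pred, target, num_classes=3)
print(f"mIOU = {score:.4f}")  # classwise IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```

Benchmark results such as those quoted above are typically accumulated over the whole test set by summing confusion matrices across images before taking the classwise ratios, rather than averaging per-image scores.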