MFA-Encoder: A Multilevel Feature-Aware Hybrid Encoder for Object Detection in Optical Remote Sensing Images under DETR Architecture
Abstract: Remote sensing images often present complex backgrounds, introducing significant noise that hinders feature extraction and object detection, particularly for small objects. This paper introduces the Multilevel Feature-Aware Hybrid Encoder (MFA-Encoder), designed to efficiently extract and fuse features by selectively employing different attention modules at various levels within the encoder. Instead of applying self-attention to features at all scales, this hybrid encoder leverages self-attention, cross-channel cross-attention, and multiscale compound attention. It expands the dimensions of high-level features and reinforces the representation of low-level features. Comprehensive experiments demonstrate that our model achieves the best performance, with an AP of 65.5% on the DIOR dataset.
External IDs: dblp:conf/igarss/LiBCKL24