M2-Net: A Multi-scale Multi-level Feature Enhanced Network for Object Detection in Optical Remote Sensing Images

Xinhai Ye, Fengchao Xiong, Jianfeng Lu, Haifeng Zhao, Jun Zhou

2020 (modified: 15 Nov 2022)DICTA 2020Readers: Everyone

Abstract: Object detection in remote sensing images is a challenging task due to diversified orientation, complex background, dense distribution and scale variation of objects. In this paper, we tackle this problem by proposing a novel multi-scale multi-level feature enhanced network ( M <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> -Net) that integrates a Feature Map Enhancement (FME) module and a Feature Fusion Block (FFB) into Rotational RetinaNet. The FME module aims to enhance the weak features by factorizing the convolutional operation into two similar branches instead of one single branch, which helps to broaden receptive field with less parameters. This module is embedded into different layers in the backbone network to capture multi-scale semantics and location information for detection. The FFB module is used to shorten the information propagation path between low-level high-resolution features in shallow layers and high-level semantic features in deep layers, facilitating more effective feature fusion and object detection especially those with small sizes. Experimental results on three benchmark datasets show that our method not only outperforms many one-stage detectors but also achieves competitive accuracy with lower time cost than two-stage detectors.

0 Replies