Abstract: Single-stage object detectors have recently gained popularity due to their combined advantage of high detection
accuracy and real-time speed. However, while promising
results have been achieved by these detectors on standard-sized objects, their performance on small objects is far from
satisfactory. To detect very small/large objects, a classical
pyramid representation can be exploited, where an image
pyramid is used to build a feature pyramid (featurized image
pyramid), enabling detection across a range of scales. Existing single-stage detectors avoid such a featurized image
pyramid representation due to its memory and time complexity. In this paper, we introduce a light-weight architecture to efficiently produce a featurized image pyramid in a
single-stage detection framework. The resulting multi-scale
features are then injected into the prediction layers of the
detector using an attention module. The performance of our
detector is validated on two benchmarks: PASCAL VOC
and MS COCO. For a 300×300 input, our detector operates at 111 frames per second (FPS) on a Titan X GPU,
providing state-of-the-art detection accuracy on the PASCAL
VOC 2007 test set. On the MS COCO test set, our detector achieves state-of-the-art results, surpassing all existing
single-stage methods in the case of single-scale inference.
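
For readers unfamiliar with the classical image pyramid mentioned above, the following is a minimal sketch of how one can be built by repeated 2× downsampling. It is only an illustration, not the light-weight architecture proposed in the paper: the function names `downsample_2x` and `image_pyramid`, the use of 2×2 average pooling, and the number of levels are all assumptions made for this example.

```python
import numpy as np


def downsample_2x(image: np.ndarray) -> np.ndarray:
    """Halve height and width by averaging non-overlapping 2x2 blocks.

    Hypothetical helper for illustration; average pooling is an assumed
    choice, not something the abstract specifies.
    """
    h, w = image.shape[:2]
    h2, w2 = h // 2, w // 2
    # Crop an odd trailing row/column so the image tiles evenly into 2x2 blocks.
    cropped = image[: h2 * 2, : w2 * 2]
    # Split each spatial axis into (blocks, 2) and average within each block.
    blocks = cropped.reshape(h2, 2, w2, 2, *image.shape[2:])
    return blocks.mean(axis=(1, 3))


def image_pyramid(image: np.ndarray, levels: int) -> list:
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


if __name__ == "__main__":
    # A 300x300 RGB input, matching the input size quoted in the abstract.
    image = np.random.rand(300, 300, 3)
    for level, scaled in enumerate(image_pyramid(image, levels=4)):
        print(level, scaled.shape)
```

In a featurized image pyramid, each of these downscaled images would additionally be passed through a feature extractor. Doing that with a full backbone at every scale is the memory and time cost that the abstract says existing single-stage detectors avoid.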