Abstract: This paper proposes an innovative object detector by leveraging deep features
learned in high-level layers. Compared with features produced in earlier layers,
the deep features are better at expressing semantic and contextual information.
The proposed deep feature learning scheme shifts the focus from concrete features with details to abstract ones with semantic information. It considers not
only individual objects and local contexts but also their relationships by building
a multi-scale deep feature learning network (MDFN). MDFN detects objects
efficiently by introducing information square and cubic inception modules into
the high-level layers; these modules employ parameter sharing to improve computational efficiency. MDFN provides a multi-scale object detector by integrating
multi-box, multi-scale and multi-level technologies. Although MDFN employs
a simple framework with a relatively small base network (VGG-16), it achieves
detection results better than or competitive with those of detectors with a macro hierarchical
structure that is either very deep or very wide for stronger feature
extraction. The proposed technique is evaluated extensively on the KITTI, PASCAL VOC, and COCO datasets, achieving the best results on KITTI and
leading performance on PASCAL VOC and COCO. This study reveals that
deep features provide prominent semantic information and a variety of contextual content, which contribute to MDFN's superior performance in detecting small
or occluded objects. In addition, the MDFN model is computationally efficient,
striking a good trade-off between accuracy and speed.
Keywords: deep feature learning, multi-scale, semantic and contextual
information, small and occluded objects.