Keywords: object detection, object recognition, deep learning
Abstract: Object detection remains one of the most notorious open problems in computer
vision. Despite large strides in accuracy and speed in recent years, modern object
detectors have started to saturate on popular benchmarks. How far can we push
the detection accuracy with the current deep learning tools and tricks? In this
work, by employing two popular state-of-the-art object detection toolboxes,
MMDetection and Detectron2, and analyzing more than 15 models over 4
large-scale datasets, we systematically determine the upper bound in AP, which is
91.6% on PASCAL VOC (test2007), 78.2% on MS COCO (val2017), and 58.9%
on OpenImages (V4 validation set), regardless of the IoU threshold. These numbers are
much higher than the mAP of the best current models (e.g., 58% on MS COCO according
to the most recent results). Interestingly, the gap seems to be almost closed at
IoU=0.5. We also analyze the role of context in object recognition and detection
and find that the canonical object size leads to the best recognition accuracy.
Finally, we carefully characterize the sources of errors in deep object detectors and
find that classification error (confusion with other classes and misses) explains the
largest fraction of errors and weighs more than localization error. Further, models
miss small objects more often than medium and large ones. Our work
taps into the tight relationship between object recognition and detection and offers
insights to build better object detectors. Similar analyses can also be conducted
for other tasks in computer vision, such as instance segmentation and object
tracking. The code is available at [TBA].
One-sentence Summary: We determine the empirical upper bound on mean average precision in object detection and show that current models perform well below this upper bound.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=c4TmqAXpzW