Keywords: Object Detection, Detection Backbone, Neural Architecture Search, Zero-Shot NAS
Abstract: In object detection models, the detection backbone consumes more than half of the overall inference cost. Recent research attempts to reduce this cost by optimizing the backbone architecture with Neural Architecture Search (NAS). However, existing NAS methods for object detection require hundreds to thousands of GPU hours of searching, making them impractical in fast-paced research and development. In this work, we propose a novel zero-shot NAS method to address this issue. The proposed method, named ZenDet, automatically designs efficient detection backbones without training network parameters, reducing the architecture design cost to nearly zero while still delivering state-of-the-art (SOTA) performance. Under the hood, ZenDet maximizes the differential entropy of detection backbones, leading to a better feature extractor for object detection under the same computational budget. After merely one GPU day of fully automatic design, ZenDet produces SOTA detection backbones on multiple detection benchmark datasets with little human intervention. Compared to a ResNet-50 backbone, ZenDet improves mAP by $+2.0\%$ when using the same amount of FLOPs/parameters and is $1.54$ times faster on an NVIDIA V100 at the same mAP. Code and pre-trained models will be released after publication.
One-sentence Summary: This work proposes ZenDet, a novel zero-shot NAS method for object detection that designs SOTA detection backbones without parameter training.