Keywords: Face Detection, Neural Architecture Search, Network Expressivity
Abstract: Face detection (FD) has achieved remarkable success over the past few years, yet
these gains often come at an enormous computational cost. Moreover,
in realistic settings that call for a lightweight face detector
under a scarce computation budget, such heavy cost limits the applicability
of the face detector. To remedy this, several pioneering works design
tiny face detectors through off-the-shelf neural architecture search (NAS) technologies,
which are usually applied to the classification task. As a result, the searched
architectures are sub-optimal for the face detection task because some design criteria
differ between the detection and classification tasks. For example, the
face detection backbone must guarantee stage-level detection ability,
which is not required of a classification backbone. Furthermore, the detection
backbone consumes a large share of the inference budget of the whole detection framework.
Considering this intrinsic design requirement and the vital role
of the face detection backbone, we ask a critical question: how can we employ
NAS to search for an FD-friendly backbone architecture? To answer this question,
we propose a distribution-dependent stage-aware ranking score (DDSAR-Score)
to explicitly characterize the stage-level expressivity and identify the individual
importance of each stage, thus satisfying the aforementioned design criterion of
the FD backbone. Based on our proposed DDSAR-Score, we conduct comprehensive
experiments on the challenging Wider Face benchmark dataset and achieve
dominant performance across a wide range of compute regimes. In particular,
compared to the tiniest face detector, SCRFD-0.5GF, our method achieves a 2.5% higher
Average Precision (AP) score with the same amount of FLOPs. The
code is available at https://github.com/ly19965/FaceMaas/tree/master/face_project/face_detection/DamoFD.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)
TL;DR: We propose a novel DDSAR-Score to characterize stage-wise detection ability and, based on it, employ off-the-shelf NAS technology to search for FD-friendly backbone architectures.