Progressively Refined Face Detection Through Semantics-Enriched Representation Learning

Zhihang Li, Xu Tang, Xiang Wu, Jingtuo Liu, Ran He

2020 (modified: 18 Sept 2021), IEEE Trans. Inf. Forensics Secur. 2020, Readers: Everyone

Abstract: Feature pyramids aim to learn multi-scale representations for detecting faces at various scales. However, they often lack adequate context across scales, especially when there are many tiny faces in the wild. In this paper, we propose an attention-guided, semantically enriched feature aggregation framework to learn a feature pyramid with rich semantics at all scales for face detection. Specifically, high-level abstract features are directly integrated into low-level representations through skip connections to retain as much semantic information as possible. In addition, an attention mechanism is employed as a gate to emphasize relevant features and suppress irrelevant ones during feature fusion. Inspired by human visual perception of tiny faces, we design a deep progressive refined loss (DPRL) to facilitate feature learning. Following these principles, we design and investigate various feature pyramid frameworks through extensive experiments. Finally, we propose two typical structures for face detection, Centralized Attention Feature (CAF) and Distributed Attention Feature (DAF), both of which are in-place and end-to-end trainable. Extensive experiments across different aggregation architectures on four challenging face detection benchmarks demonstrate the superiority of our framework over state-of-the-art methods.
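
The gated fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' CAF or DAF architecture: the function names, the nearest-neighbour upsampling, the single-channel sigmoid gate computed by a 1x1 convolution (`gate_weights`, `gate_bias`), and the additive fusion are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def attention_gated_fusion(low, high, gate_weights, gate_bias=0.0):
    """Fuse a high-level map into a low-level map through a spatial gate.

    low:          (C, H, W) low-level (high-resolution) features
    high:         (C, H/f, W/f) high-level (low-resolution) features
    gate_weights: (C,) weights of a hypothetical 1x1 convolution
    gate_bias:    scalar bias of that hypothetical convolution
    """
    factor = low.shape[1] // high.shape[1]
    high_up = upsample_nearest(high, factor)  # skip connection input
    # 1x1 convolution over channels: one attention logit per location.
    logits = np.tensordot(gate_weights, high_up, axes=([0], [0])) + gate_bias
    gate = sigmoid(logits)  # (H, W), values in (0, 1)
    # The gate scales how much high-level semantics reaches each location.
    return low + gate[None, :, :] * high_up


# Toy example: a 2-channel pyramid with a 4x4 low level and a 2x2 high level.
low = np.zeros((2, 4, 4))
high = np.ones((2, 2, 2))
fused = attention_gated_fusion(low, high, gate_weights=np.zeros(2))
```

With zero gate weights and zero bias, every gate value is 0.5, so half of the upsampled high-level signal is added at each location; a strongly negative bias closes the gate and leaves the low-level features essentially unchanged.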
