Detecting Multi-Scale Faces Using Attention-Based Feature Fusion and Smoothed Context Enhancement

Published: 01 Jan 2020, Last Modified: 04 Mar 2025. IEEE Trans. Biom. Behav. Identity Sci. 2020. License: CC BY-SA 4.0
Abstract: Though tremendous strides have been made in face detection, it remains a challenging problem due to scale variance. In this paper, we propose a smoothed attention network, dubbed SANet, for scale-invariant face detection that takes advantage of feature fusion and context enhancement. To reduce the noise in features fused across different levels, we design an Attention-guided Feature Fusion Module (AFFM). In addition, we conduct an exhaustive analysis of the effect of the attention mechanism on performance, considering channel-wise attention, spatial-wise attention, and their combinations. To enrich contextual information with dilated convolutions while avoiding the gridding artifacts they introduce, we propose a Smoothed Context Enhancement Module (SCEM). Our method achieves state-of-the-art results on the UFDD dataset and comparable performance on the WIDER FACE, FDDB, PASCAL Faces, and AFW datasets.
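The abstract does not give the exact designs of the two modules, so the following is only a minimal sketch of the two ideas it describes, written in PyTorch. The layer sizes, the particular channel/spatial attention layout, and the use of a plain 3x3 convolution as the smoothing step before the dilated branches are illustrative assumptions, not the authors' published implementation.

```python
# Hedged sketch of attention-guided feature fusion and smoothed dilated
# context enhancement, as described at a high level in the abstract.
# All hyperparameters and design details below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionGuidedFusion(nn.Module):
    """Fuse a shallow and a deep feature map, then re-weight the result with
    channel-wise and spatial-wise attention to suppress fusion noise."""

    def __init__(self, channels: int):
        super().__init__()
        # Channel attention: global pooling -> bottleneck -> sigmoid gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: 7x7 conv over pooled channel statistics -> sigmoid gate.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Upsample the coarser (deeper) map and fuse by element-wise addition.
        deep = F.interpolate(deep, size=shallow.shape[-2:], mode="nearest")
        fused = shallow + deep
        # Channel-wise re-weighting.
        fused = fused * self.channel_gate(fused)
        # Spatial re-weighting from mean- and max-pooled channel descriptors.
        stats = torch.cat(
            [fused.mean(dim=1, keepdim=True), fused.amax(dim=1, keepdim=True)], dim=1
        )
        return fused * self.spatial_gate(stats)


class SmoothedContextEnhancement(nn.Module):
    """Enlarge the receptive field with parallel dilated convolutions,
    preceded by an ordinary 3x3 convolution so neighbouring pixels exchange
    information before the sparse dilated sampling (one common way to
    reduce gridding artifacts)."""

    def __init__(self, channels: int, dilations=(2, 4, 6)):
        super().__init__()
        self.smooth = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        self.project = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.smooth(x))
        context = torch.cat([F.relu(branch(x)) for branch in self.branches], dim=1)
        return self.project(context)


if __name__ == "__main__":
    # Hypothetical shape check with made-up feature map sizes.
    fusion = AttentionGuidedFusion(256)
    fused = fusion(torch.randn(1, 256, 80, 80), torch.randn(1, 256, 40, 40))
    out = SmoothedContextEnhancement(256)(fused)
    print(out.shape)  # -> torch.Size([1, 256, 80, 80])
```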