How Do Low-Level Image Features Affect CNN-Based Face Detector Accuracy?

Ahana Roy Choudhury, K. S. Krishnapriya, Chase E. Vaughan, Radu Paul Mihail

Published: 01 Jan 2023, Last Modified: 18 Mar 2024, Venue: SMC 2023, Readers: Everyone

Abstract: Face detectors are a subset of object detectors that output, at a minimum, the locations of any human faces present in an image. Face detection is challenging in part because frontal-view faces vary little in structural content (most faces have two eyes, a nose, and a mouth) but vary widely in visual appearance. This combination skews detectors toward higher false positive rates, because many image patches contain features that are spatially consistent with frontal-view faces. In this study, we evaluate the performance of three state-of-the-art face detectors (BlazeFace, MTCNN, and SCRFD) on frontal-view face imagery in a novel human-labeled dataset of 64,104 images with reliable ground truth. We present evidence that modern CNN-based models rely heavily on low-level image features, despite their capacity to learn complex, discriminative visual features and concepts. We do this by altering the spectral and color content of frontal-view face images. To gain a better understanding of detector failures, we apply the Deep Dream technique to enhance image features that lead models to false positives.
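
The abstract does not say which spectral or color alterations were applied, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes images are NumPy float arrays in [0, 1] with shape (height, width, 3). The function names `low_pass` and `desaturate`, and their parameters, are hypothetical and chosen for this example.

```python
import numpy as np


def low_pass(image, cutoff):
    """Keep only spatial frequencies within `cutoff` (cycles per pixel) of DC.

    Assumption: an ideal circular low-pass mask in the Fourier domain stands in
    for "altering spectral content"; the paper may use a different alteration.
    """
    h, w = image.shape[:2]
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    if image.ndim == 3:
        mask = mask[:, :, None]       # broadcast the mask over color channels
    spectrum = np.fft.fft2(image, axes=(0, 1))
    filtered = np.fft.ifft2(spectrum * mask, axes=(0, 1)).real
    return np.clip(filtered, 0.0, 1.0)


def desaturate(image, amount):
    """Blend an RGB image toward its luminance; amount=1.0 yields grayscale.

    Assumption: Rec. 601 luma weights stand in for "altering color content".
    """
    weights = np.array([0.299, 0.587, 0.114])
    gray = image @ weights            # per-pixel luminance, shape (h, w)
    return (1.0 - amount) * image + amount * gray[..., None]
```

Under these assumptions, an experiment would run a detector on `low_pass(img, c)` or `desaturate(img, a)` over a range of `c` and `a` values and compare its output with the ground-truth labels of the unaltered images.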
