Zero-Shot Evaluation of Commercial Software and State-of-the-Art FER Models on Standardized Datasets
Abstract: Commercial facial expression recognition tools, such as FaceReader 9, are often used as off-the-shelf solutions in applied research and industry. However, their real-world generalization capacity, especially in dynamic and unconstrained environments, is rarely scrutinized. This study evaluates the zero-shot performance of FaceReader 9 on two standardized dynamic datasets, RAVDESS and CREMA-D, and compares its results with several publicly available state-of-the-art FER models. The results reveal that FaceReader 9 is significantly outperformed across all metrics, with accuracy levels close to random chance on the more challenging dataset. In contrast, even static models trained on general-purpose datasets perform markedly better, and a dynamic model specifically trained on the evaluation datasets achieves a substantial performance gain. These findings emphasize the limitations of commercial FER systems in dynamic contexts and highlight the value of task-specific training and temporal modeling for robust emotion recognition.
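The abstract does not specify the exact evaluation pipeline, so the following is only a minimal sketch of how a zero-shot accuracy comparison on video datasets like RAVDESS or CREMA-D might look: sample a representative frame per clip, run a pretrained static FER classifier without any fine-tuning, and score predictions against the clip labels. The model name, file layout, and middle-frame sampling strategy are illustrative assumptions, not details taken from the paper.

```python
# Illustrative zero-shot evaluation sketch (assumptions, not the paper's pipeline):
# - each sample is a (video_path, emotion_label) pair
# - a pretrained FER image classifier is loaded via Hugging Face transformers
# - one middle frame per clip stands in for a static-model input
import cv2
from PIL import Image
from transformers import pipeline
from sklearn.metrics import accuracy_score

def middle_frame(video_path):
    """Return the middle frame of a clip as an RGB array, or None on failure."""
    cap = cv2.VideoCapture(video_path)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, max(n_frames // 2, 0))
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    return cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

def evaluate_zero_shot(samples, model_name="some-org/fer-model"):
    """Zero-shot accuracy of a pretrained classifier on (video_path, label) pairs.

    `model_name` is a placeholder; labels are assumed to match the model's
    output vocabulary (e.g. the basic emotion categories shared by the datasets).
    """
    classifier = pipeline("image-classification", model=model_name)
    y_true, y_pred = [], []
    for path, label in samples:
        frame = middle_frame(path)
        if frame is None:
            continue  # skip unreadable clips
        top_prediction = classifier(Image.fromarray(frame))[0]["label"]
        y_true.append(label)
        y_pred.append(top_prediction)
    return accuracy_score(y_true, y_pred)
```

A dynamic model would replace the single-frame step with a sequence of frames per clip, which is where the temporal modeling advantage highlighted in the abstract would come into play.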
External IDs: doi:10.1007/978-3-032-11317-7_4