SWP: A Stage-Weighted Pooling Based Convolutional Neural Networks for Facial Expression Recognition in the Wild

Kuan-Hsien Liu, Wen-Ren Liu, Tsung-Jung Liu

Published: 2024, Last Modified: 11 Jul 2025, MLSP 2024, CC BY-SA 4.0

Abstract: One of the primary difficulties in facial expression recognition (FER) is the non-uniform distribution of samples among expression classes within datasets like RAF-DB, FERPlus, and AffectNet. This imbalance poses a challenge for most deep learning models, leading to inflated training costs and prolonged training times due to the sheer volume of parameters and operations involved. In response to this issue, we propose SWP-Net, a lightweight facial expression recognition model designed to strike a balance between parameters and training cycles. Our SWP-Net includes four key modules: facial feature extraction, facial feature attention, stages weighted multi-layer perception, and stages weighted fusion. SWP-Net is a novel lightweight FER model that improves efficiency while maintaining accuracy. With only 14.16 million parameters and 2.14 GMACs (giga multiply-accumulate operations), SWP-Net achieves competitive accuracies on two public benchmark FER datasets, reaching 89.6% on RAF-DB and 62.3% on AffectNet. The source code of our model will be made available on GitHub at https://github.com/nutcliu2507/SWP-Stages-Weighted-Pooling-CNN-with-FER.
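
The abstract names a "stages weighted fusion" module but does not describe how it is computed. As a rough illustration of the general idea of weighting pooled features from several network stages, the sketch below pools each stage's feature map with global average pooling and combines the results using softmax-normalized stage weights. This is an assumption for illustration only, not the authors' implementation: the function names, the use of softmax, and the requirement that every stage share one channel count are all hypothetical.

```python
import math


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def softmax(scores):
    """Turn raw stage scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stage_weighted_fusion(stage_maps, stage_scores):
    """Fuse per-stage pooled descriptors with softmax-normalized weights.

    Hypothetical simplification: every stage is assumed to have the same
    number of channels, so the pooled vectors can be summed directly.
    """
    if len(stage_maps) != len(stage_scores):
        raise ValueError("need exactly one score per stage")
    weights = softmax(stage_scores)
    pooled = [global_average_pool(m) for m in stage_maps]
    channels = len(pooled[0])
    if any(len(p) != channels for p in pooled):
        raise ValueError("all stages must share a channel count")
    return [
        sum(w * p[c] for w, p in zip(weights, pooled))
        for c in range(channels)
    ]


# Two toy stages, each with 2 channels of size 2 x 2.
stage_a = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
stage_b = [[[3.0, 3.0], [3.0, 3.0]], [[5.0, 5.0], [5.0, 5.0]]]

# Equal scores give equal weights, so the result is the plain average.
fused = stage_weighted_fusion([stage_a, stage_b], [0.0, 0.0])
```

In a trained network the stage scores would be learned parameters rather than fixed constants; raising one stage's score shifts the fused descriptor toward that stage's pooled features.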