BFER-Net: Babies Facial Expression Recognition Model Using ResNet12 Enabled Few-Shot Embedding Adaptation and Convolutional Block Attention Modules

Sumiya Arafin, Adnan Ferdous Ashrafi, Md. Golam Rabiul Alam, Ashis Talukder

Published: 01 Jan 2025, Last Modified: 06 Nov 2025, IEEE Access, CC BY-SA 4.0

Abstract: The recognition of babies’ facial expressions is challenging due to the limited availability of annotated data and the complex nature of their emotions. To address this problem, this work introduces a novel dataset, FER-BYC (Facial Expression Recognition for Bangladeshi Young Children), comprising 1,425 annotated images of babies’ facial expressions across seven emotional categories: disgust, fear, anger, happiness, neutral, sadness, and surprise. This dataset fills a major gap in the domain, as few prior studies address babies’ facial expression detection. We propose a fusion model named BFER-Net. For feature extraction, a Convolutional Block Attention Module (CBAM) is integrated into the Modified ResNet12 architecture, which allows the model to focus on the most relevant facial features. Few-shot learning techniques, such as FEAT (Few-shot Embedding Adaptation with Transformer), Modified ResNet12, and Prototypical Networks, are especially appropriate for the small dataset size. The proposed approach has been evaluated on the FER-BYC dataset, achieving 94.06% validation accuracy, which exceeds the classification accuracy of traditional methods. This research not only introduces a new dataset but also provides a robust technique for baby facial expression recognition.
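
The abstract names Prototypical Networks as one of the few-shot components. As a rough illustration only, the following sketch shows the generic prototypical classification step: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype by squared Euclidean distance. It is not the authors' implementation and omits both CBAM and the FEAT transformer adaptation. The function names and the toy 2-D embeddings are hypothetical; in practice the embeddings would come from a trained backbone.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return (predicted_label, probabilities) via softmax over negative distances."""
    logits = {
        label: -squared_distance(query, proto)
        for label, proto in prototypes.items()
    }
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - max_logit) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: v / total for label, v in exps.items()}
    return max(probs, key=probs.get), probs


# Toy 2-D embeddings (hypothetical values, not taken from FER-BYC).
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["happiness", "happiness", "sadness", "sadness"]

prototypes = class_prototypes(support, labels)
predicted, probs = classify([0.9, 0.1], prototypes)
print(predicted)
```

With only a handful of labeled images per class, averaging embeddings in this way avoids fitting a large classification head, which is one reason such methods are commonly chosen for small datasets.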

External IDs: doi:10.1109/access.2025.3545759