FacePhi: Lightweight Multimodal Large Language Model for Facial Landmark Emotion Recognition

Published: 05 Mar 2024, Last Modified: 12 May 2024
Venue: PML4LRS Poster
License: CC BY 4.0
Keywords: Large Language Model, Facial Landmark, Emotion Recognition
Abstract: We introduce FacePhi, a multimodal large language model (LLM) for emotion recognition from facial landmarks. By focusing on facial landmarks, FacePhi preserves privacy in emotion detection tasks. FacePhi is optimized for computational efficiency: it builds on Phi-2, an LLM with a small number of parameters, and uses lightweight facial landmark data. These design choices make FacePhi suitable for deployment in resource-constrained settings. Our investigation highlights the importance of feature alignment during training, indicating that it plays a pivotal role in improving the model's performance on the challenging task of facial landmark emotion recognition.
Submission Number: 48