Dynamic Lip Motion Analysis for Deepfake Detection

Published: 01 Jan 2025, Last Modified: 07 Nov 2025 | CSCS 2025 | CC BY-SA 4.0
Abstract: The proliferation of deepfake videos poses a significant threat to the integrity of digital media, given their capacity to disseminate misinformation and undermine the reliability of visual content. As synthetic audiovisual forgeries become increasingly realistic, they are being used in coordinated disinformation efforts, contributing to the broader issue of information disorder, which compromises public trust, media authenticity, and the verifiability of digital evidence. This study investigates the effectiveness of spatio-temporal features, extracted using Local Binary Patterns on Three Orthogonal Planes (LBP-TOP), in distinguishing synthetic from authentic video content. The approach focuses on the dynamic characteristics of the labial region, where inconsistencies in facial motion are more likely to occur in deepfake videos. LBP-TOP is employed to capture subtle texture and motion variations across spatial and temporal dimensions. Preliminary empirical evaluations conducted on benchmark deepfake datasets demonstrate the efficacy of the proposed approach, emphasizing the importance of localized temporal analysis in enhancing detection accuracy. The results highlight the potential of region-specific facial modelling as a computationally efficient yet discriminative strategy in the context of video forensics and the mitigation of synthetic media-based disinformation.
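To illustrate the kind of descriptor the abstract refers to, the following is a minimal sketch of LBP-TOP on a grayscale lip-region clip. It is not the authors' implementation: the abstract does not state the neighbourhood size, radius, use of uniform patterns, block partitioning, or classifier, so this sketch assumes the basic 8-neighbour, radius-1 LBP with 256 bins per plane, and assumes the mouth region has already been cropped into an array of shape (frames, height, width). The function names `lbp_codes` and `lbp_top_histogram` are hypothetical.

```python
import numpy as np

def lbp_codes(plane):
    """Basic 8-neighbour, radius-1 LBP codes for the interior pixels of a 2D array."""
    plane = np.asarray(plane, dtype=np.float64)
    h, w = plane.shape
    center = plane[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    # Neighbour offsets (dy, dx), visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def lbp_top_histogram(volume):
    """Concatenate normalised LBP histograms from the XY, XT and YT planes.

    `volume` is assumed to be a grayscale clip of shape (frames, height, width).
    Returns a vector of length 3 * 256.
    """
    volume = np.asarray(volume, dtype=np.float64)
    if volume.ndim != 3 or min(volume.shape) < 3:
        raise ValueError("expected a (T, H, W) volume with every dimension >= 3")
    hists = []
    # axis 0: XY planes (fixed frame), axis 1: XT planes (fixed row),
    # axis 2: YT planes (fixed column).
    for axis in range(3):
        counts = np.zeros(256, dtype=np.float64)
        # Skip the two border slices so every plane passes through interior voxels.
        for i in range(1, volume.shape[axis] - 1):
            plane = np.take(volume, i, axis=axis)
            counts += np.bincount(lbp_codes(plane).ravel(), minlength=256)
        hists.append(counts / counts.sum())
    return np.concatenate(hists)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.integers(0, 256, size=(10, 16, 24))  # synthetic stand-in for a lip crop
    features = lbp_top_histogram(clip)
    print(features.shape)
```

The XY histogram summarises spatial texture, while the XT and YT histograms summarise how that texture changes over time along rows and columns. The resulting 768-dimensional vector could then be passed to any standard classifier; which one the study uses is not stated in the abstract.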