Abstract: The problem of automatic Visual Language IDentification (VLID), i.e. identifying the language being spoken from visual information alone, without any audio, is studied. The proposed method employs facial landmarks automatically detected in a video. A convex optimisation problem is formulated to jointly find both the discriminative representation (a soft-histogram over a set of lip shapes) and the classifier. A 10-fold cross-validation on a dataset of 644 videos collected from youtube.com yields an accuracy of 73% in pairwise discrimination between English and French (chance level: 50%). A study with 10 videos suggests that the proposed method performs better than the average human at discriminating between the two languages.
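To make the representation mentioned in the abstract concrete, the following is a minimal sketch of a soft-histogram encoding over a set of lip shapes: each frame's lip-landmark descriptor is softly assigned to prototype shapes and the assignments are averaged into a fixed-length video descriptor. The prototype shapes, the descriptor dimensionality, and the softness parameter `beta` are illustrative assumptions; in the proposed method the representation is learned jointly with the classifier via convex optimisation, which this sketch does not attempt.

```python
import numpy as np

def soft_histogram(frames, prototypes, beta=1.0):
    """Encode a video as a soft-histogram over prototype lip shapes.

    frames:     (T, d) array of per-frame lip-landmark descriptors
    prototypes: (K, d) array of prototype lip shapes (assumed fixed here;
                the paper learns the representation jointly with the classifier)
    beta:       softness of the assignment (illustrative parameter)
    """
    # Squared Euclidean distance between every frame and every prototype
    d2 = ((frames[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    # Soft assignment of each frame to the prototypes (softmax over -beta * d2)
    w = np.exp(-beta * d2)
    w /= w.sum(axis=1, keepdims=True)
    # Average over frames -> fixed-length descriptor summing to 1
    return w.mean(axis=0)

# Example with hypothetical sizes: 120 frames of 40-D landmark features, 32 prototypes
rng = np.random.default_rng(0)
video = rng.normal(size=(120, 40))
shapes = rng.normal(size=(32, 40))
h = soft_histogram(video, shapes)
print(h.shape, h.sum())  # (32,) 1.0
```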