Extracting Sign Language Articulation from Videos with MediaPipe

Published: 20 Mar 2023, Last Modified: 28 Mar 2023
Venue: NoDaLiDa 2023
Readers: Everyone
Keywords: sign language, computer vision, mediapipe, phonology
TL;DR: MediaPipe's tracking of the hands' location and movement in videos can be used to estimate the articulation phase of signs, hand dominance, number of hands and place of articulation.
Abstract: This paper evaluates methods for extracting phonological information about Swedish Sign Language signs from video data using MediaPipe's pose estimation. The methods estimate i) the articulation phase, ii) hand dominance (left vs. right), iii) the number of hands articulating (one- vs. two-handed signs) and iv) the sign's place of articulation. The results show that MediaPipe's tracking of the hands' location and movement in videos can be used to estimate the articulation phase of signs. Including transport movements improves the accuracy of the estimates of hand dominance and number of hands, whereas removing transport movements is crucial for estimating a sign's place of articulation.
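
To make the idea of estimating hand dominance and number of hands from tracked movement concrete, here is a minimal sketch in Python. It is not the paper's method: the function names, the path-length heuristic, and the `two_handed_ratio` threshold of 0.5 are all assumptions made for illustration. It assumes each wrist has already been tracked as a per-frame list of normalized (x, y) image coordinates, as pose estimators such as MediaPipe provide.

```python
import math


def path_length(points):
    """Total Euclidean distance travelled along a sequence of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def estimate_hands(left_wrist, right_wrist, two_handed_ratio=0.5):
    """Guess hand dominance and number of active hands from wrist trajectories.

    Hypothetical heuristic (an assumption, not taken from the paper):
    - the hand that travels further is treated as dominant;
    - the sign is treated as two-handed if the less active hand travels
      at least `two_handed_ratio` times as far as the more active hand.
    """
    left = path_length(left_wrist)
    right = path_length(right_wrist)

    dominant = "right" if right >= left else "left"

    larger, smaller = max(left, right), min(left, right)
    ratio = smaller / larger if larger > 0 else 0.0
    n_hands = 2 if ratio >= two_handed_ratio else 1

    return dominant, n_hands


if __name__ == "__main__":
    # Synthetic trajectories: the right wrist rises and returns, the left stays put.
    right = [(0.6, 0.8), (0.6, 0.5), (0.6, 0.3), (0.6, 0.5), (0.6, 0.8)]
    left = [(0.4, 0.8)] * 5
    print(estimate_hands(left, right))
```

Because this sketch sums movement over the whole clip, transport movements to and from the rest position count toward each hand's total. A place-of-articulation estimate would instead need to restrict the frames to the articulation phase, as the abstract notes.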
