Developing a Lightweight Model for Lip-reading

I Wayan Wiprayoga Wisesa, Shanq-Jang Ruan

Published: 01 Jan 2023, Last Modified: 14 Nov 2023, ICCE-Taiwan 2023, Readers: Everyone

Abstract: Lip-reading makes it possible to understand speech from a silent video. Previous work on deep-learning lip-reading models has mainly focused on improving model accuracy, with a corresponding increase in model complexity. However, for resource-limited devices, computational complexity is also an essential factor in evaluating a lip-reading model, alongside the model’s classification accuracy, with an acceptable trade-off between the two. Our work focuses on developing a low-complexity, lightweight model for the lip-reading task. This paper evaluates two lightweight lip-reading models that use computationally inexpensive feature extractors and classifiers. We trained and evaluated our lightweight architectures on LRW, a popular large-scale public lip-reading dataset. The experimental results suggest that, with much lower computational complexity and parameter size, our lightweight architecture is comparable with some reference models on word-level English lip-reading.
