TD3Net: A temporal densely connected multi-dilated convolutional network for lipreading

Published: 01 Jan 2025, Last Modified: 07 Sept 2025
Venue: J. Vis. Commun. Image Represent. 2025
License: CC BY-SA 4.0
Abstract: Highlights
• TD3Net covers a wide and dense receptive field without blind spots.
• Continuity loss in lip motion caused by blind spots is avoided via adaptive dilation.
• Learns multi-temporal representations effectively across temporal modeling layers.
• Achieves high accuracy with fewer parameters and FLOPs than previous TCN models.
• Delivers significant performance improvements in word-level lipreading.
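
The "blind spots" mentioned in the highlights can be illustrated with a small, generic sketch. It is not the TD3Net architecture and is not taken from the paper; it only enumerates which input time offsets can influence one output step when 1D convolutions with given dilation rates are stacked. The function names `reachable_offsets` and `blind_spots`, the default kernel size of 3, and the example dilation rates are all assumptions chosen for illustration.

```python
from itertools import product


def reachable_offsets(dilations, kernel_size=3):
    """Input time offsets that can influence one output step after stacking
    1D convolutions (odd kernel, centered) with the given dilation rates."""
    half = kernel_size // 2
    # Tap positions of each layer, relative to its own output position.
    taps_per_layer = [
        [k * d for k in range(-half, half + 1)] for d in dilations
    ]
    # A path through the stack picks one tap per layer; offsets add up.
    return {sum(path) for path in product(*taps_per_layer)}


def blind_spots(dilations, kernel_size=3):
    """Offsets inside the receptive field span that no path ever reaches."""
    offsets = reachable_offsets(dilations, kernel_size)
    lo, hi = min(offsets), max(offsets)
    return sorted(set(range(lo, hi + 1)) - offsets)


if __name__ == "__main__":
    # Two stacked layers that both use dilation 2 skip every odd offset.
    print(blind_spots([2, 2]))
    # Mixing dilation 1 and 2 covers the same kind of span densely.
    print(blind_spots([1, 2]))
```

In this toy setting, repeating the same dilation rate leaves frames that never contribute to the output, while mixing rates fills the gaps. That is the general effect a dense, blind-spot-free receptive field is meant to avoid; how TD3Net selects its dilation rates is not described in the text above.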