TD3Net: A temporal densely connected multi-dilated convolutional network for lipreading

Byung Hoon Lee, Wooseok Shin, Sung Won Han

Published: 2025, Last Modified: 07 Sept 2025. J. Vis. Commun. Image Represent. 2025. License: CC BY-SA 4.0

Abstract: Highlights
• TD3Net covers a wide and dense receptive field without blind spots.
• Continuity loss in lip motion caused by blind spots is avoided via adaptive dilation.
• Learns multi-temporal representations effectively across temporal modeling layers.
• Achieves high accuracy with fewer parameters and FLOPs than previous TCN models.
• Delivers significant performance improvements in word-level lipreading.
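The "blind spots" mentioned in the highlights can be illustrated with a small receptive-field calculation. The sketch below is not the authors' implementation and says nothing about how TD3Net chooses its dilations. It only enumerates which input time offsets a generic stack of 1D dilated convolutions can reach, assuming an odd kernel size and symmetric (non-causal) taps. The function names `covered_offsets` and `blind_spots` are hypothetical helpers introduced for this illustration.

```python
def covered_offsets(kernel_size, dilations):
    """Offsets (relative to the output frame) reachable by a stack of
    1D dilated convolutions. Assumes an odd kernel size and symmetric taps."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        # Each layer adds one of its tap positions to every reachable offset.
        taps = [k * d for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def blind_spots(kernel_size, dilations):
    """Offsets inside the receptive-field span that no path through the stack reaches."""
    covered = covered_offsets(kernel_size, dilations)
    span = range(min(covered), max(covered) + 1)
    return sorted(set(span) - covered)


if __name__ == "__main__":
    # Dilations 1, 2, 4 with kernel size 3 cover every frame from -7 to 7.
    print(blind_spots(3, [1, 2, 4]))
    # Starting at dilation 2 skips every odd offset.
    print(blind_spots(3, [2, 4]))
```

In this toy setting, a stack whose dilations begin at 1 and double covers its span densely, while a stack that uses only large dilations leaves frames that never influence the output. That kind of gap is what the highlights refer to as a blind spot in the receptive field.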

External IDs: dblp:journals/jvcir/LeeSH25