Audio-visual speech recognition based on regulated transformer and spatio-temporal fusion strategy for driver assistive systems

Published: 01 Jan 2024, Last Modified: 07 May 2025 | Expert Syst. Appl. 2024 | CC BY-SA 4.0
Abstract: Highlights
• A novel transformer-based method for audio–visual speech command recognition.
• Novel fusion strategies of audio–visual features and classifier ensemble.
• An attention visualization approach for audio–visual feature impact assessment.
• A software application of the transformer-based method for driver assistive systems.