Microphone array geometry-independent multi-talker distant ASR: NTT system for DASR task of the CHiME-8 challenge
Abstract: Highlights
• A multi-talker ASR system achieving 63% relative macro tcpWER improvement over the CHiME-8 DASR task baseline.
• A powerful diarization frontend combining EEND-VC, TS-VAD, and multi-channel speaker counting.
• Speech enhancement using improved microphone selection and SP-MWF beamformer.
• Four ASR backends exploiting speech foundation models (Whisper and WavLM).
• An extensive experimental study and ablation analyzing each component of our system.
External IDs: dblp:journals/csl/KamoTAKSIMHMOPAOMDNAA26