STCON NIST SRE24 System: Composite Speaker Recognition Solution for Challenging Scenarios

Published: 2025, Last Modified: 21 Jan 2026, INTERSPEECH 2025, CC BY-SA 4.0
Abstract: This paper addresses the real-world challenges of voice biometrics, specifically those highlighted by the National Institute of Standards and Technology Speaker Recognition (SR) Evaluation 2024 (NIST SRE24). We present multi-module SR systems that integrate speaker diarization, robust embedding extraction, and adaptive scoring to handle multi-speaker recordings, variable speech segment durations, and cross-channel and cross-lingual speaker recognition scenarios. We propose a methodology for training robust speaker embedding extractors and explore a range of state-of-the-art SR neural network architectures. We focus on the combined optimization of two traditionally separate components, diarization and speaker recognition, and present practical observations regarding this integrated approach. These findings allowed our systems to secure top leaderboard positions in NIST SRE24.