Fine-tuning TitaNet-Large Model for Speaker Anonymization Attacker Systems

Candy Olivia Mawalim, Aulia Adila, Masashi Unoki

Published: 2025, Last Modified: 27 Feb 2026 | ICASSP 2025 | CC BY-SA 4.0

Abstract: Speaker anonymization techniques are crucial for safeguarding user privacy in voice-based applications. However, these methods are susceptible to adversarial attacks that can compromise their effectiveness. This paper proposes attacker systems that use fine-tuned TitaNet-Large and ECAPA-TDNN models to identify the original speaker from anonymized speech generated by various anonymization methods. Both pre-trained models are known for their state-of-the-art ability to extract robust speaker embeddings. Fine-tuning these models on anonymized speech enables them to learn the speaker-dependent patterns that remain after anonymization. We evaluated the proposed attacker systems against multiple anonymization techniques that performed effectively in a series of voice privacy challenges. Our experimental results underscore the effectiveness of the fine-tuned TitaNet-Large model in breaking through these anonymization methods, as indicated by the reduced equal error rate (EER). This highlights the importance of robust and adaptive anonymization strategies to counter such emerging semi-informed threats.
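The abstract reports attack strength through the equal error rate (EER): the operating point at which the false acceptance rate and the false rejection rate of a speaker verification system coincide. For an attacker, a lower EER means the anonymized speech is easier to link back to the original speaker. The following is a minimal sketch of how EER can be estimated from verification scores. It is not the authors' evaluation code; the function name `equal_error_rate` and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from two lists of similarity scores.

    target_scores: scores for trials where both utterances share a speaker.
    nontarget_scores: scores for trials with different speakers.
    Hypothetical helper for illustration; higher score = more similar.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a same-speaker trial scores below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scores at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up): well-separated trials mean a successful attack.
strong_attack = equal_error_rate([0.9, 0.8], [0.1, 0.2])
# Overlapping scores mean the attacker cannot tell speakers apart.
weak_attack = equal_error_rate([0.1, 0.9], [0.1, 0.9])
print(strong_attack, weak_attack)
```

In this sketch the scores would typically be cosine similarities between speaker embeddings of enrollment and trial utterances. Practical toolkits usually interpolate between thresholds rather than picking the nearest one, so values on small trial lists can differ slightly.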

External IDs: dblp:conf/icassp/MawalimAU25