On Learning Frequency-Instance Correlations by Model-Agnostic Training for Synthetic Speech Detection

Published: 05 Sept 2024 · Last Modified: 16 Oct 2024 · ACML 2024 Conference Track · CC BY 4.0
Keywords: Anti-spoofing; synthetic speech detection; consistency loss
Abstract: Synthetic Speech Detection (SSD) aims to detect spoofed speech synthesized by text-to-speech and voice conversion systems. Most existing SSD methods mine only frequency-wise dependencies by customizing frequency-aggregation modules within the model, while instance-wise dependencies, which are critical for identifying synthetic speech from a global view, remain under-explored. In this paper, we propose a novel model-agnostic training strategy for SSD that exploits both local (frequency-wise) and global (instance-wise) contexts; it does not rely on a customized architecture and can be flexibly integrated into previous SSD models. Specifically, an inter-frequency correlation module captures the local context by reconstructing masked frequency information from the unmasked frequency context. Meanwhile, an inter-instance correlation module explores the global context across instances by promoting intra-class compactness and inter-class dispersion in the latent space. These two complementary modules operate from distinct contextual perspectives and jointly improve SSD performance. Extensive experiments show that our method significantly improves two state-of-the-art models on the ASVspoof 2019 and ASVspoof 2021 datasets.
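The abstract describes two training objectives: a local one that reconstructs masked frequency bins from their unmasked context, and a global one that pulls same-class instance embeddings together while pushing different-class embeddings apart. A minimal NumPy sketch of losses in this spirit is below; the function names, the pairwise contrastive formulation, and the `margin` parameter are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def masked_reconstruction_loss(spec, mask, reconstruct):
    """Inter-frequency objective (sketch): zero out masked frequency bins,
    reconstruct them from the unmasked context, and penalize the squared
    error only on the masked positions. `reconstruct` stands in for the
    paper's reconstruction network (assumed callable: array -> array)."""
    recon = reconstruct(spec * (1.0 - mask))
    diff = (recon - spec) * mask          # error counted on masked bins only
    return (diff ** 2).sum() / max(mask.sum(), 1.0)

def instance_correlation_loss(embeddings, labels, margin=1.0):
    """Inter-instance objective (sketch): a pairwise contrastive loss that
    encourages intra-class compactness and inter-class dispersion."""
    loss, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            if labels[i] == labels[j]:
                loss += d ** 2                       # pull same class together
            else:
                loss += max(0.0, margin - d) ** 2    # push classes >= margin apart
            pairs += 1
    return loss / max(pairs, 1)
```

Being model-agnostic, objectives of this shape could in principle be added to an existing SSD model's training loss without changing its architecture, which is the integration property the abstract claims.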
Primary Area: Deep Learning (architectures, deep reinforcement learning, generative models, deep learning theory, etc.)
Student Author: Yes
Submission Number: 348