One-class network leveraging spectro-temporal features for generalized synthetic speech detection

Jiahong Ye, Diqun Yan, Songyin Fu, Bin Ma, Zhihua Xia

Published: 2025, Last Modified: 22 Jul 2025, Venue: Speech Commun. 2025, License: CC BY-SA 4.0

Abstract: Highlights
• We designed a spectro-temporal network to capture F0 subband differences.
• We improved OC-Softmax with the KoLeo regularizer to enhance intra-class balance.
• Experiments on ASVspoof 2019 LA and 2021 LA show that our method achieves better results.
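To make the second highlight concrete, the following is a minimal sketch of how an OC-Softmax loss might be paired with a KoLeo regularizer. It is not the authors' implementation. The two loss terms follow their generally published forms as we understand them: OC-Softmax as a margin-based softplus on the cosine similarity to a single bona fide direction, and KoLeo as the negative mean log distance to each point's nearest neighbour. The margins `m_real=0.9` and `m_fake=0.2`, the scale `alpha=20.0`, the weight `lam`, the helper `combined_loss`, and the choice to regularize only bona fide embeddings are all assumptions made for illustration; the highlights do not specify them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def oc_softmax_loss(embeddings, labels, w, m_real=0.9, m_fake=0.2, alpha=20.0):
    """OC-Softmax sketch: label 0 = bona fide, label 1 = spoofed."""
    w_hat = l2_normalize(w)
    total = 0.0
    for x, y in zip(embeddings, labels):
        cos = sum(a * b for a, b in zip(w_hat, l2_normalize(x)))
        # Bona fide: push cosine above m_real. Spoofed: push it below m_fake.
        z = alpha * (m_real - cos) if y == 0 else alpha * (cos - m_fake)
        # Numerically stable log(1 + exp(z)).
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(embeddings)


def koleo_regularizer(embeddings, eps=1e-8):
    """KoLeo sketch: negative mean log distance to the nearest other point.

    Needs at least two embeddings. Lower values mean the normalized
    points are spread more evenly.
    """
    pts = [l2_normalize(x) for x in embeddings]
    total = 0.0
    for i, p in enumerate(pts):
        nearest = min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
        total += math.log(nearest + eps)
    return -total / len(pts)


def combined_loss(embeddings, labels, w, lam=0.1):
    """Hypothetical combination: OC-Softmax plus lam * KoLeo.

    Assumption: the regularizer is applied to bona fide embeddings only.
    """
    bona_fide = [x for x, y in zip(embeddings, labels) if y == 0]
    reg = koleo_regularizer(bona_fide) if len(bona_fide) > 1 else 0.0
    return oc_softmax_loss(embeddings, labels, w) + lam * reg
```

In this sketch the regularizer penalizes bona fide embeddings that collapse onto each other, which is one plausible reading of "intra-class balance"; the paper itself should be consulted for the actual formulation and weighting.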