Adversarial Feature Disentanglement Framework for Voice Pathology Detection

Published: 01 Jan 2025, Last Modified: 28 Jul 2025, Venue: ICASSP 2025, License: CC BY-SA 4.0
Abstract: Voice pathology detection plays an important role in diagnosis and medical intervention. Existing methods perform poorly when training samples are limited and imbalanced, because pathological information is coupled with other linguistic and paralinguistic attributes. This work proposes a novel adversarial feature disentanglement framework that performs voice pathology detection by extracting task-oriented features and optimizing the feature space. Specifically, to suppress noise features in the coupled representations, an adversarial feature disentanglement mechanism decouples pathological from non-pathological information, and a mutual information discriminator is introduced to prevent information leakage. A classification and contrastive learning (CCL) module is designed to cluster intra-class embeddings in the high-dimensional space. Experiments on the open-source SVD and FEMH datasets show that the proposed model outperforms competitive baselines, achieving 87.94% and 92.06% accuracy, respectively. The visualization also confirms that the model effectively distinguishes pathological from non-pathological features.
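The abstract does not give the CCL objective, so the following is only a minimal sketch of how a contrastive term might pull intra-class embeddings together. It assumes a generic supervised contrastive loss over L2-normalized embeddings; the function names, the `temperature` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumption, not the paper's CCL).

    For each anchor, same-label samples are positives and all other samples
    form the softmax denominator. Anchors without positives are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the denominator.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        # Average negative log-probability of the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy 2-D embeddings: the first two and the last two lie close together.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

With labels that match the geometric clusters the loss is small, while labels that cut across the clusters give a much larger value. That is the behavior a contrastive term is meant to reward when it is combined with a classification loss.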