Cross-Cultural Automatic Depression Detection Based on Audio Signals

Published: 01 Jan 2024, Last Modified: 28 Mar 2025, Venue: SPECOM (1) 2024, License: CC BY-SA 4.0
Abstract: Depression is a common mental health disorder worldwide, and early detection is critical for effective treatment. In this paper, we explore the effectiveness of machine learning techniques for cross-cultural depression detection using audio signals from Chinese- and English-speaking populations. We investigate the influence of temporal context length, feature sets, and classifiers on classification performance across two single-corpus and two cross-corpus settings. Our results show that hand-crafted features offer advantages in single and combined dataset settings, while deep learning-based features, particularly those from emotion recognition tasks, demonstrate superior cross-dataset generalization. The optimal temporal context length strongly depends on the specific dataset. These findings highlight the importance of considering dataset-specific characteristics and feature selection in developing reliable and culturally adaptable models for depression detection. In the cross-corpus settings on the MENHIR and CMDC datasets, we obtained best F1 scores of 0.77 and 0.63, respectively. Future research should focus on enhancing model performance and data accessibility to ensure effective inclusion across diverse populations, ultimately contributing to better mental health outcomes globally.