Abstract: This paper describes our submission to the ICASSP 2023 Signal Processing Grand Challenge (SPGC), which focuses on multilingual Alzheimer’s disease (AD) recognition from spontaneous speech. We use a variety of acoustic features together with silence-related information for both AD detection and mini-mental state examination (MMSE) score prediction, and we fine-tune wav2vec2.0 models on speech restricted to various frequency bands for AD detection. Our results on the test data outperform the baseline provided by the organizers: we achieve 73.9% accuracy in AD detection by fine-tuning our bilingual wav2vec2.0 pre-trained model on speech in the 0-1000 Hz frequency band, and an RMSE of 4.610 (r = 0.565) in MMSE prediction by fusing eGeMAPS and silence features.
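
The MMSE prediction result is reported as a root-mean-square error (RMSE) together with a correlation coefficient r. As a minimal sketch of how these two metrics can be computed, the following assumes that r is the Pearson correlation between true and predicted scores (the abstract does not state this explicitly). The function names and the example scores are hypothetical and are not taken from the challenge data.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length score lists."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed meaning of r here)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up MMSE scores (valid range 0-30), for illustration only.
    true_scores = [28, 22, 15, 30, 19]
    predicted_scores = [26, 24, 18, 27, 20]
    print(f"RMSE = {rmse(true_scores, predicted_scores):.3f}")
    print(f"r    = {pearson_r(true_scores, predicted_scores):.3f}")
```

A lower RMSE indicates predictions closer to the true MMSE scores on average, while r measures how well the predictions track the ranking and spread of the true scores, so the two figures are complementary.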