Abstract: Classifying subtypes of primary progressive aphasia (PPA) from connected speech is diagnostically challenging because the subtypes share overlapping linguistic markers. This study benchmarks traditional machine learning models paired with various feature extraction techniques, transformer-based models, and large language models (LLMs) on PPA classification. Our results indicate that while transformer-based models and LLMs exceed chance-level balanced accuracy, traditional classifiers combined with contextual embeddings remain highly competitive. Notably, an MLP using MentalBERT embeddings achieves the highest accuracy. These findings underscore the potential of machine learning for improving the automatic classification of PPA subtypes.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: clinical NLP, healthcare applications, clinical text analysis, neurodegenerative diseases, machine learning, LLMs, transformer models
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3943