Keywords: primary progressive aphasia, semantic variant primary progressive aphasia, interpretable machine learning, clinical NLP, speech biomarkers, aphasia subtyping, scene graph semantics, phonological-semantic dissociation, multimodal speech analysis, explainable AI for healthcare
Abstract: We present IPSS-PPA (Interpretable Phonological--Semantic Scoring), a fully automated pipeline for primary progressive aphasia (PPA) subtype classification from picture-description speech. The semantic variant (svPPA) is particularly difficult to detect: speech remains fluent and grammatically intact while semantic content degrades, so acoustic and fluency-based biomarkers are poorly suited to it. We address this by routing prediction through ten clinician-specified constructs (anomia, agrammatism, empty speech, and seven others) co-designed with speech-language pathologists (SLPs), so that every classification decision is directly traceable to named clinical features rather than recovered post hoc from a black-box model (Rudin, 2019; Koh et al., 2020). Phonological and semantic adequacy scores are computed from these constructs, and semi-automatic speaker diarization reduces the manual annotation bottleneck of prior work. On 254 Western Aphasia Battery (WAB) Picnic recordings, we achieve an AUC of 0.918 for control vs. svPPA and an AUC of 0.734 ($p < 10^{-5}$) for svPPA vs. the nonfluent and logopenic variants (nfvPPA+lvPPA), to our knowledge the first reported result on this clinically critical distinction.
Paper Type: Long
Research Area: Clinical and Biomedical Applications
Research Area Keywords: other clinical and biomedical applications which explicitly involve natural language as a central element in the paper contribution.
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
EMNLP 2026 AI Reviewing Experiment: no
Submission Number: 11765
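The abstract describes routing prediction through ten named clinical constructs so that each decision can be traced back to them. The following is a minimal sketch of that general idea in the style of a concept bottleneck (Koh et al., 2020), not the authors' implementation. It assumes construct scores in [0, 1] and a linear head; the functions `classify` and `top_contributors`, the weights, and the bias are hypothetical. Only three construct names appear in the abstract, so the remaining seven are left as neutral placeholders.

```python
import math

# Three construct names come from the abstract; the other seven are not
# named there, so they are represented by neutral placeholders (assumption).
CONSTRUCTS = ["anomia", "agrammatism", "empty_speech"] + [
    f"construct_{i}" for i in range(4, 11)
]


def classify(scores, weights, bias=0.0):
    """Apply a linear head to construct scores (hypothetical design).

    Returns the predicted probability and the per-construct contributions
    to the logit, so the decision can be read off named features.
    """
    missing = [c for c in CONSTRUCTS if c not in scores or c not in weights]
    if missing:
        raise KeyError(f"missing constructs: {missing}")
    contributions = {c: weights[c] * scores[c] for c in CONSTRUCTS}
    logit = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


def top_contributors(contributions, k=3):
    """Return the k constructs with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    weights = {c: 0.0 for c in CONSTRUCTS}
    weights.update({"anomia": 2.0, "empty_speech": 1.0})
    scores = {c: 0.0 for c in CONSTRUCTS}
    scores.update({"anomia": 1.0, "empty_speech": 1.0})
    prob, contrib = classify(scores, weights, bias=-3.0)
    print(prob, top_contributors(contrib, k=2))
```

Because the prediction is a function of the construct scores alone, inspecting `contributions` gives the explanation directly, with no separate post hoc attribution step.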