Abstract: Current methods used to diagnose or monitor dementia-related cognitive decline predominantly rely on audio recordings.
Such recordings can leak personally identifiable information and, given deepfake technology, create new risks.
We introduce generative likelihood-based approaches to identify differences between healthy and dementia-diagnosed participants via gaze tracking and text transcriptions during a standard diagnostic image description task, without relying on sensitive audio information.
Contrary to conventional wisdom, we find that text transcriptions alone are not a reliable measure of cognitive impairment in this task, whereas gaze tracking is more reliable. This suggests that existing results in language-based dementia detection rely primarily on audio signals.
Paper Type: Short
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond
Contribution Types: Approaches to low-resource settings, Data analysis
Languages Studied: English
Submission Number: 4288