Keywords: NLP, MT, Dyslexia, Disparities, LLM
TL;DR: Demonstrating performance disparities between dyslexia-style text and typical text in popular translation services
Abstract: Dyslexia is a neurodivergence that affects one's ability to process and produce textual information. Previous research has identified distinctive patterns in the writing of people with dyslexia, such as letter swapping and homophone confusion, that differ from the text typically used to train and evaluate common natural language processing (NLP) systems such as machine translation (MT). It remains unclear, however, how current state-of-the-art NLP systems perform for users with dyslexia. In this work, we explore this question through a systematic audit of commercial MT services using synthetic dyslexia data. By injecting common dyslexia-style writing errors into popular benchmarking datasets, we benchmark three commercial MT services and one large language model (LLM) under various types and quantities of dyslexia-style errors and show a substantial disparity in MT quality between dyslexia-style and typical text. Although people with dyslexia often rely on modern NLP tools as assistive technologies, our results shed light on the fairness challenges this demographic faces with popular NLP services, highlighting the need to develop more inclusive and equitable NLP models for users with diverse language use patterns.
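The error-injection step mentioned in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`swap_letters`, `inject_errors`), the `rate` parameter, and the tiny `HOMOPHONES` table are all hypothetical, and only the two error types named in the abstract (letter swapping and homophone confusion) are covered.

```python
import random

# Hypothetical homophone table for illustration only; not taken from the paper.
HOMOPHONES = {
    "their": "there",
    "there": "their",
    "to": "too",
    "too": "to",
    "hear": "here",
    "here": "hear",
}


def swap_letters(word, rng):
    """Transpose two adjacent interior letters, keeping first and last fixed."""
    if len(word) < 4:
        return word  # too short to have two interior letters
    i = rng.randrange(1, len(word) - 2)  # positions i and i+1 are both interior
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def inject_errors(sentence, rate, seed=0):
    """Perturb each whitespace-separated word with probability `rate`.

    A selected word is replaced by a homophone when the table has one;
    otherwise two of its adjacent interior letters are swapped.
    """
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    out = []
    for word in sentence.split():
        if rng.random() >= rate:
            out.append(word)  # leave this word untouched
        elif word.lower() in HOMOPHONES:
            out.append(HOMOPHONES[word.lower()])
        else:
            out.append(swap_letters(word, rng))
    return " ".join(out)
```

Varying `rate` gives a simple way to control the quantity of injected errors, and the perturbed sentences can then be sent to each translation system alongside the unmodified originals so that the two sets of outputs are scored against the same references.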
Supplementary Material: pdf
Submission Number: 1891