Contextualizing Parental Behaviors in Bilingual Datasets from In-Person and Telehealth Language Assessment

ACL ARR 2025 May Submission4776 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: The increasing adoption of telehealth technologies presents both opportunities and challenges, offering greater convenience for patients while increasing clinicians’ workload, particularly in managing remotely collected data. Bilingual speech-language pathologists (SLPs) spend substantial effort evaluating parent behaviors when conducting family-centered language assessments. In this interdisciplinary study, we collaborate with SLPs to examine how domain-specific large language models (LLMs) can support clinical workflows and provide contextualized analysis to address real-world challenges in telehealth. Our team collected a detailed bilingual dataset of 59 Mandarin-English child language assessment sessions (16 in-person and 43 via telehealth) and benchmarked three open-source LLMs and one closed-source LLM on this task. All four LLMs still fall short of human experts despite notable accuracy, and an additional error analysis revealed that domain complexity, cultural context, and multimodal cues pose significant challenges for both LLMs and human annotators. This work highlights the need for domain-specific NLP advancement and for evaluation methods that extend beyond standard benchmarks to include clinical utility, workflow integration, and cultural appropriateness in the context of bilingual telehealth assessment.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: multilingual benchmarks; multilingualism; multilingual evaluation
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English, Mandarin
Submission Number: 4776