Keywords: Interactive dialogue evaluation, multilingual LLM dialogue, human-like LLM, evaluation and metrics
Abstract: Current Large Language Model (LLM) evaluations rely heavily on static benchmarks, often failing to capture the interaction essential for human-like communication in multi-turn, continuous human-LLM conversations. We introduce a novel evaluation framework grounded in the Common European Framework of Reference for Languages (CEFR) and in Social Relationship and Power Distance (SRPD) interaction in social communication to evaluate multilingual dialogue interactions. Unlike static metrics, our approach analyzes emergent behaviors, such as repair and alignment, in dynamic, multi-turn interactions without manual annotation. Validated across 18 diverse languages, from high-resource (e.g., Spanish, French, English) to low-resource (e.g., Bengali, Thai, Swahili), the framework aligns with established static baseline results while uncovering critical behavioral nuances in lower-resource settings that static evaluations miss. This work provides a scalable methodology for measuring how effectively models adapt to users' languages and domain-specific social contexts through more dynamic interaction evaluations.
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: human-like LLM, dialogue evaluation, multilingual LLM cultural analysis, multilingual dialogue evaluation
Contribution Types: Model analysis & interpretability, Reproduction study, Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Data resources, Data analysis
Languages Studied: Arabic, English, Portuguese, French, Italian, Turkish, Hindi, Mandarin, Japanese, Vietnamese, Thai, Swahili, Bengali, Indonesian, Spanish, Yoruba
Submission Number: 9321