Eliciting interactional competence: Comparing AI and human-elicited roleplays

Yunwen Su; Seungwon Hong; Shengnan Xiao; Meng Zhou

Published: 28 Apr 2026, Last Modified: 28 Apr 2026 | MSLD 2026 Poster | Everyone | CC BY 4.0

Keywords: interactional competence, pragmatic assessment, AI interlocutor

TL;DR: This study investigates the potential of generative AI as an alternative interlocutor by comparing the interactional competence features elicited in L2 English speakers’ roleplay conversations with AI versus with a native-speaking human peer.

Abstract: Recent research in pragmatics assessment has increasingly emphasized interactional competence (IC) as a central construct. Assessing IC typically requires tasks that elicit interactive language use, such as roleplays or paired discussions. However, including a human interlocutor in these tasks often reduces the practicality and standardization of assessment (Galaczi & Taylor, 2018; Ockey & Chukharev-Hudilainen, 2021). This study investigates the potential of generative AI as an alternative interlocutor by comparing the IC features elicited in English as a second language (ESL) speakers’ roleplay conversations with an AI interlocutor versus a native-speaking human peer. Thirty-three ESL speakers completed a 6-item roleplay task targeting refusals of requests, invitations, and offers twice, one week apart: once with a trained L1 English-speaking human interlocutor and once with an AI interlocutor (ChatGPT 4o Advanced Voice Mode). In addition, 18 L1 English speakers completed the same task in pairs and with the AI interlocutor, also one week apart. All roleplays were audio-recorded and transcribed verbatim following Conversation Analysis (CA) conventions (Jefferson, 2004). Three categories of interactional features were then coded: Length and Complexity (utterance length, utterance complexity, turn number, turn length), coded using Python, and Engagement with Interaction (acknowledgement tokens, inter-turn pauses, comprehension check tokens) and Preference Organization (delay, mitigation, justification, alternative), both coded manually. Preliminary analyses using generalized mixed-effects models show significant effects of modality (human versus AI interlocutor) on Length and Complexity and on Engagement with Interaction, but not on Preference Organization. The results suggest that generative AI has potential for simulating conversations with ESL speakers in pragmatics learning and testing.
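
As an illustration of the automated Length and Complexity measures mentioned in the abstract, the following is a minimal Python sketch of how turn number and turn length could be computed from a transcript. It is not the authors' script: the function name `turn_metrics`, the `(speaker, utterance)` input format, the merging of consecutive same-speaker utterances into one turn, and whitespace-based word counting are all assumptions made for this example. CA transcription symbols would also need to be stripped before counting words.

```python
from collections import defaultdict


def turn_metrics(transcript):
    """Return per-speaker turn count and mean turn length in words.

    `transcript` is a list of (speaker, utterance) pairs (hypothetical format).
    Consecutive utterances by the same speaker are merged into a single turn,
    which is an assumption, not a definition taken from the study.
    """
    turns = []  # each entry is [speaker, word_count]
    for speaker, utterance in transcript:
        n_words = len(utterance.split())  # naive whitespace tokenization
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += n_words  # same speaker continues the turn
        else:
            turns.append([speaker, n_words])  # speaker change starts a new turn

    counts = defaultdict(int)
    words = defaultdict(int)
    for speaker, n_words in turns:
        counts[speaker] += 1
        words[speaker] += n_words

    return {
        s: {"turns": counts[s], "mean_turn_length": words[s] / counts[s]}
        for s in counts
    }


# Invented example data: a request followed by a refusal.
example = [
    ("A", "Can you help me move on Saturday"),
    ("B", "Oh um"),
    ("B", "I would love to but I have an exam"),
    ("A", "No worries"),
    ("B", "Maybe next weekend"),
]
metrics = turn_metrics(example)
```

Per-speaker values like these could then serve as outcome variables when comparing the human and AI interlocutor conditions.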

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 39
