Beyond the Turn-Based Game: Duplex Models Enable Real-Time Conversations

ACL ARR 2024 April Submission30 Authors

09 Apr 2024 (modified: 22 May 2024), ACL ARR 2024 April Submission, CC BY 4.0
Abstract: As large language models (LLMs) increasingly permeate daily life, there is a growing demand for interactions that mirror human conversation in real time. Traditional LLM-based chat systems are turn-based, which prevents users from interacting verbally with the model while it is generating output. To overcome this limitation, we introduce duplex models, which can receive user input while generating output and adjust dynamically to instant user feedback such as interruptions. To endow LLMs with these characteristics, we use a time-segment decoding strategy that lets the model process inputs and generate responses pseudo-simultaneously. Furthermore, to make LLMs proficient at handling real-time conversations, we construct a fine-tuning dataset of interleaved, time-segmented pieces of input and output that covers the typical kinds of feedback found in instantaneous interactions. In our experiments, we find that although the inputs and outputs are segmented into incomplete pieces, the model preserves its performance on standard benchmarks after only a few steps of training. Moreover, this approach makes user-AI interactions more natural and human-like, greatly improving user satisfaction in our user experiments. The model and dataset will be released.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: Dialogue and Interactive Systems, Human-Centered NLP, Generation, Language Modeling
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: English;Chinese
Section 2 Permission To Publish Peer Reviewers Content Agreement: Authors grant permission for ACL to publish peer reviewers' content
Submission Number: 30
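
Illustrative sketch (not the authors' implementation): the abstract describes processing input and generating output pseudo-simultaneously over time segments. The toy loop below shows one way such interleaving could look. The names `duplex_loop`, `toy_respond`, the `<idle>` marker, and the keyword-based interruption check are all hypothetical and chosen only for illustration; a real system would call an LLM where `toy_respond` is used.

```python
from typing import Callable, List, Optional

IDLE = "<idle>"  # hypothetical marker for "nothing said in this time segment"


def duplex_loop(
    input_segments: List[Optional[str]],
    respond: Callable[[List[str]], str],
) -> List[str]:
    """Alternate between one segment of user input and one segment of output.

    Each loop iteration stands for one time segment: whatever the user said
    during that segment (possibly nothing) is appended to the context, then
    the responder produces a short piece of output for the same segment.
    """
    context: List[str] = []
    for user_piece in input_segments:
        context.append(f"user: {user_piece if user_piece else IDLE}")
        context.append(f"model: {respond(context)}")
    return context


def toy_respond(context: List[str]) -> str:
    """Stand-in for a model: speaks a scripted reply one word per segment
    and falls silent once any user segment contains the word 'stop'."""
    interrupted = any(
        c.startswith("user: ") and "stop" in c.lower() for c in context
    )
    if interrupted:
        return IDLE  # yield the floor after an interruption
    script = ["Sure,", "here", "is", "a", "long", "answer"]
    spoken = sum(
        1 for c in context if c.startswith("model: ") and c != f"model: {IDLE}"
    )
    return script[spoken] if spoken < len(script) else IDLE


# The user asks, stays silent for two segments, then interrupts.
segments = ["Tell me a story", None, None, "Stop please", None]
transcript = duplex_loop(segments, toy_respond)
for line in transcript:
    print(line)
```

Because the user's interruption arrives in the fourth segment, the toy responder stops mid-reply instead of finishing its scripted answer, which is the behaviour a turn-based loop cannot express.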