Keywords: code generation, LLMs, multi-turn interaction
Abstract: Large Language Models (LLMs) are quickly emerging as partners that help users solve complex tasks over multiple interaction turns. To study such interactions, we introduce \ourtask, a task and evaluation framework for multi-turn user-LLM interaction on a collaborative coding task, in which a user works with an LLM assistant to design a website. Most existing work on LLM assistants studies single-initiative settings, where the assistant generates only output attempts or only clarifying questions for the user. We demonstrate that both are suboptimal: attempting an output at every turn is inefficient, as it significantly increases interaction length, while asking questions at every turn is ineffective, as LLMs struggle to ask good clarifying questions consecutively without attempting the task. Given these tradeoffs, we propose mixed-initiative interactions, where the LLM alternates between generating clarifying questions and attempting an output, achieving 99% of the output quality of such single-initiative interactions with conversations that are only 55% as long. Lastly, we investigate why mixed-initiative interactions are so effective, demonstrating that they can lead to more helpful user answers to clarifying questions and more efficient communication between the user and the assistant.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21281