Abstract: Large Language Models (LLMs) are widely used to generate code in a pair-programming-like setting. However, the intents behind users' coding instructions are often ambiguous, and models are limited in their ability to use dialogue to resolve that ambiguity and produce unambiguous code.
This presents a fundamental difficulty in code generation, wherein ambiguity in natural language can lead to seemingly correct programs that differ from what was intended. We propose to use dialogue to reduce this ambiguity, specifically in the plotting domain, and contribute an analysis of the different types of ambiguity that may arise in multi-modal code generation. Based on our analysis, we propose different pragmatic models to inform dialogue strategies for ambiguity resolution, including those based on Rational Speech Acts (cooperative), Discourse Theory (discoursive), and Questions under Discussion (inquisitive). Finally, we compare these dialogue strategies in a simulated dialogue setting, operationalizing the pragmatic models via prompting. Our findings suggest that the discoursive and cooperative reasoning styles yield the best results for executability and disambiguation overall, while the inquisitive style performs best at disambiguating vagueness. These results suggest that simulated dialogues grounded in pragmatic frameworks can resolve ambiguities in code generation.
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: code generation, dialogue, discourse, pragmatics, ambiguity
Contribution Types: Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 1852