Abstract: Annotated dialogue datasets are crucial for training models for task-oriented systems, but such datasets are often scarce. Although automated approaches for generating dialogue data exist, producing high-quality dialogues with accurate annotations remains a challenge. To address this issue, we propose Few-Shot Controlled Dialogue Generation Using In-Context Learning (ICI-CDG). It uses turn-level dialogue retrieval to enhance the in-context learning ability of large language models, enabling the rapid and automatic generation of high-quality controllable dialogues. ICI-CDG consists of three modules: goal generation, turn-level dialogue retrieval, and state matching filtering. The goal generation module produces overall goals for new dialogues by randomly sampling from diverse dialogues; these goals provide a clear direction and framework for generation. The turn-level dialogue retrieval module searches for highly similar turns to use as prompts, improving the quality of the generated dialogue. The state matching filtering module checks that generated content corresponds to the turn goals, reducing semantic deviation between the annotation and the dialogue. Experimental results on two datasets show that our method generates more natural dialogues with more accurate annotations, outperforming existing methods in few-shot settings.
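The turn-level retrieval step can be sketched as a similarity search over a pool of annotated turns, whose top matches are then used as in-context demonstrations. The bag-of-words encoder, toy corpus, and scoring below are illustrative assumptions for exposition, not the paper's actual implementation (which would typically use a learned sentence encoder).

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words vector; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_turns(turn_goal, corpus_turns, k=2):
    # Rank annotated turns by similarity to the current turn goal and
    # return the top-k as prompt demonstrations for the language model.
    q = embed(turn_goal)
    ranked = sorted(corpus_turns, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]

# Hypothetical pool of annotated turns.
corpus = [
    "user asks to book a table for two at an italian restaurant",
    "system confirms hotel reservation for three nights",
    "user requests a taxi to the airport",
]
print(retrieve_turns("book an italian restaurant for dinner", corpus, k=1))
```

The retrieved turns would then be prepended to the generation prompt, so the model imitates examples that closely match the current turn goal.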