Paper Link: https://openreview.net/forum?id=bmO0jrA2lP
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Goal-oriented dialogue systems face a trade-off between fluent language generation and task-specific control. While supervised learning with large language models is capable of producing realistic text, how to steer such responses towards completing a specific task without sacrificing language quality remains an open question. In this work, we formulate goal-oriented dialogue as a partially observed Markov decision process, interpreting the language model as a representation of both the dynamics and the policy. This view allows us to extend techniques from learning-based control, such as task relabeling, to derive a simple and effective method to finetune language models in a goal-aware way, leading to significantly improved task performance. We additionally introduce a number of training strategies that serve to better focus the model on the task at hand. We evaluate our method, Context-Aware Language Models (CALM), on a practical flight-booking task using AirDialogue. Empirically, CALM outperforms the state-of-the-art method by 7% in terms of task success, matching human-level task performance.
Presentation Mode: This paper will be presented in person in Seattle
Virtual Presentation Timezone: UTC-7
Copyright Consent Signature (type Name Or NA If Not Transferrable): Charlie Snell
Copyright Consent Name And Address: Rm 8056, Berkeley Way West 2121 Berkeley Way Berkeley, CA 94704
0 Replies
Loading