Does Collaborative Human–LM Dialogue Generation Help Information Extraction from Human Dialogues?Download PDF

Anonymous

16 Dec 2023ACL ARR 2023 December Blind SubmissionReaders: Everyone
TL;DR: We introduce introduces a human-in-the-loop dialogue generation framework for information extraction in private call centers, demonstrating a 25% improvement in performance metrics by augmenting real conversations with synthetic data.
Abstract: The capabilities of pretrained language models have opened opportunities to explore new application areas, but applications involving human-human interaction are limited by the fact that most data is protected from public release for privacy reasons. Problem-solving human dialogues in real applications can be much more complex than existing Wizard-of-Oz collections, preventing successful domain transfer. To support information extraction (IE) for a private call center dataset, we introduce a human-in-the-loop dialogue generation framework capable of synthesizing realistic dialogues. In IE experiments with auto insurance call center dialogues, we observe 25% improvement in $F_1$ after augmenting a small set of real human conversations with synthetic data. We release code and our synthetic dataset to illustrate the complexity of real-world call center conversations and encourage development of complex dialogue datasets that are more representative of natural data.
Paper Type: long
Research Area: NLP Applications
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: English
0 Replies

Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview