Data Augmentation for Low-Resource Dialogue Summarization

Anonymous

08 Mar 2022 (modified: 05 May 2023); NAACL 2022 Conference Blind Submission; Readers: Everyone
Paper Link: https://openreview.net/forum?id=8KHmFphT9BA
Paper Type: Short paper (up to four pages of content + unlimited references and appendices)
Abstract: We present DADS, a novel Data Augmentation technique for low-resource Dialogue Summarization. Our method generates synthetic examples by replacing sections of text in both the input dialogue and its summary, while ensuring that the augmented summary remains a viable summary of the augmented dialogue. We use pretrained language models that produce highly likely dialogue alternatives while remaining free to generate diverse ones. We apply our data augmentation method to the SAMSum dataset in low-resource scenarios, mimicking real-world problems such as chat, thread, and meeting summarization, where large-scale supervised datasets with human-written summaries are scarce. Through both automatic and human evaluations, we show that DADS yields strong improvements in low-resource scenarios while generating topically diverse summaries without introducing additional hallucinations.
Presentation Mode: This paper will be presented virtually
Virtual Presentation Timezone: UTC-7
Copyright Consent Signature (type Name Or NA If Not Transferrable): Joshua Maynez
Copyright Consent Name And Address: Google, 6 Pancras Sq, London N1C 4AG