Divide-and-Conquer Text Simplification by Scalable Data Enhancement

Anonymous

08 Mar 2022 (modified: 05 May 2023)
NAACL 2022 Conference Blind Submission
Readers: Everyone
Paper Link: https://openreview.net/forum?id=8FsctOOUuh7
Paper Type: Short paper (up to four pages of content + unlimited references and appendices)
Abstract: Text simplification, which aims to reduce reading difficulty, can be decomposed into four discrete rewriting operations: substitution, deletion, reordering, and splitting. However, because of a large distribution discrepancy between existing training data and human-annotated data, models may learn improper operations, leading to poor generalization. To bridge this gap, we propose a novel data enhancement method, Simsim, that generates training pairs by simulating specific simplification operations. Experiments show that models trained with Simsim outperform multiple strong baselines and achieve better SARI scores on the Turk and Asset datasets. The newly constructed dataset Simsim is available at *.