Abstract: Paraphrase generation is a fundamental task in natural language processing. In this work, we study diverse paraphrase generation and propose a novel method that increases surface-form diversity while maintaining semantic similarity in the generated paraphrases. Our method disentangles generation into syntactic structure planning and semantic realization: it first produces a syntax tree as high-level guidance and then generates the surface form of the paraphrase conditioned on that tree. We further introduce a diversity-driven calibration loss that ranks the probabilities of model-generated sequences to enhance output diversity. We evaluate our method on the ParaNMT dataset and a newly proposed DiverseQuora dataset; our model outperforms strong baselines in both quality and diversity on both datasets.
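The abstract does not specify the form of the diversity-driven calibration loss; one plausible reading is a pairwise margin ranking loss that encourages the model to assign higher probability to more diverse candidates. The sketch below is a hypothetical illustration under that assumption — the function name, the `diversity_scores` input, and the margin scheme are illustrative choices, not details from the paper.

```python
# Hedged sketch of a diversity-driven calibration (ranking) loss.
# Assumption: each candidate paraphrase has a model log-probability and a
# precomputed surface-form diversity score; more diverse candidates should
# be ranked higher by the model. Names and margin scheme are illustrative.

def calibration_loss(log_probs, diversity_scores, margin=0.01):
    """Pairwise margin ranking loss over candidate paraphrases.

    log_probs[i]        -- model log-probability of candidate i
    diversity_scores[i] -- surface-form diversity of candidate i (higher = more diverse)
    margin              -- required log-probability gap per rank difference
    """
    # Rank candidate indices from most to least diverse.
    order = sorted(range(len(log_probs)),
                   key=lambda i: diversity_scores[i], reverse=True)
    loss = 0.0
    for a in range(len(order)):
        for b in range(a + 1, len(order)):
            hi, lo = order[a], order[b]
            # Hinge penalty when a less diverse candidate scores within
            # the required margin of (or above) a more diverse one.
            loss += max(0.0, log_probs[lo] - log_probs[hi] + margin * (b - a))
    return loss
```

When candidate log-probabilities already follow the diversity ranking with sufficient margin, the loss is zero; otherwise each violating pair contributes a hinge penalty, pushing probability mass toward more diverse outputs during training.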
Paper Type: short
Research Area: Generation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.