Using Discourse Information for Paraphrase Extraction

2012 (modified: 10 Nov 2022), EMNLP-CoNLL 2012
Abstract: Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents' discourse structure as a useful information source. We propose a novel method for collecting paraphrases relying on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boosts the performance of sentence-level paraphrase acquisition, which consequently gives a tremendous advantage for extracting phrase-level paraphrase fragments from matched sentences. Our system beats an informed baseline by a margin of 50%.