Evaluation of Similarity Measures in a Benchmark for Spanish Paraphrasing Detection - 19th Mexican International Conference on Artificial Intelligence
Abstract: In this paper, we present a similarity-based approach to paraphrase detection in Spanish. We evaluate various models for computing semantic similarity using a gold-standard paraphrase corpus. The corpus contains one original document, paraphrased versions of it at two levels (low and high), and reference documents on the same topic or with the same vocabulary. This makes it possible to assess the similarity between a pair of texts or between individual sentences. We found that some similarity metrics show a larger difference in scores than others when comparing paraphrased sentences. Finally, we obtained a threshold for each similarity metric, with the aim of determining a classification boundary for deciding whether two sentences are paraphrases.
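The thresholding idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity metrics were evaluated or how the thresholds were obtained, so everything below is an assumption made for illustration: token-set Jaccard similarity stands in for a similarity metric, the threshold is picked by maximizing accuracy on labeled pairs, and the sentence pairs are invented toy data, not taken from the corpus. The names `jaccard_similarity` and `best_threshold` are hypothetical.

```python
import re


def jaccard_similarity(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two sentences, in [0, 1]."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def best_threshold(scores, labels):
    """Return (threshold, accuracy) for the rule `score >= threshold`.

    Assumption: the boundary is chosen by maximizing accuracy over the
    observed scores; the paper may use a different criterion.
    """
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        correct = sum((s >= t) == y for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Invented toy pairs (sentence 1, sentence 2, is_paraphrase).
pairs = [
    ("el gato duerme en el sofá", "el gato duerme sobre el sofá", True),
    ("la niña lee un libro", "la niña lee un libro nuevo", True),
    ("el gato duerme en el sofá", "la bolsa cayó ayer", False),
    ("la niña lee un libro", "el tren llega tarde", False),
]

scores = [jaccard_similarity(a, b) for a, b, _ in pairs]
labels = [y for _, _, y in pairs]
threshold, accuracy = best_threshold(scores, labels)
print(f"threshold={threshold:.3f} accuracy={accuracy:.2f}")
```

Repeating the same procedure once per metric would give one boundary per metric, which matches the per-metric thresholds the abstract describes; a real evaluation would also hold out data rather than fit and score on the same pairs.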