Abstract: We present an unsupervised phrase relatedness function (f) that has been applied in a Semantic Textual Similarity system (TrWP) of SemEval-2015. The best run of TrWP was ranked 33 among 73 runs. f finds the relatedness strength between two phrases using overlapping bi-gram context extracted from the Google-n-gram corpus. The relatedness strength is the strength of association capturing how similar or dissimilar two phrases are. In order to find the relatedness strength, f applies a sum-ratio (SR) technique based on the statistics of the overlapping n-grams associated with two input phrases. The experimental result from f demonstrates improvement over existing phrase relatedness methods on two standard datasets of 216 phrase-pairs. f does not require any human annotated resource and is independent of the syntactic structure of phrases.
0 Replies
Loading