Wikifying Novel Words to Mixtures of Wikipedia Senses by Structured Sparse CodingOpen Website

2013 (modified: 08 Nov 2022)ICPRAM (Selected Papers) 2013Readers: Everyone
Abstract: We extend the scope of Wikification to novel words by relaxing two premises of Wikification: (i) we wikify without using the surface form of the word (ii) to a mixture of Wikipedia senses instead of a single sense. We identify two types of “novel” words: words where the connection between their surface form and their meaning is broken (e.g., a misspelled word), and words where there is no meaning to connect to—the meaning itself is also novel. We propose a method capable of wikifying both types of novel words while also dealing with the inherently large-scale disambiguation problem. We show that the method can disambiguate between up to 1,000 Wikipedia senses, and it can explain words with novel meaning as a mixture of other, possibly related senses. This mixture representation compares favorably to the widely used bag of words representation.
0 Replies

Loading