Keyword Indexing System with HowNet and PageRank

Published: 01 Jan 2008, Last Modified: 19 Feb 2025, ICNSC 2008, CC BY-SA 4.0
Abstract: Keyword indexing is widely used in natural language processing. This paper proposes an unsupervised keyword indexing method based on PageRank and HowNet. In this method, a free text is first represented as a sememe graph, with sememes as vertices and the HowNet-based relatedness of sememes as weighted edges. UW-PageRank is then applied to the sememe graph to score the importance of each sememe. The score of each definition of a word is computed from the scores of the sememes it contains, and the highest-scoring definition is assigned to the word. A sememe graph is then rebuilt using only the selected definition of each word, and UW-PageRank is applied again to score all sememes, from which the importance of each word is derived. Finally, the highest-scoring words are indexed as keywords. Experimental results indicate that the method is practical and effective.
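The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not define UW-PageRank, so the code below assumes it behaves like standard PageRank run on an undirected graph with weighted edges. The function names, the toy sememes, the edge weights, and the choice to sum sememe scores into a word score are all hypothetical and are not taken from the paper or from HowNet.

```python
from collections import defaultdict


def weighted_pagerank(edges, damping=0.85, iterations=50):
    """PageRank on an undirected weighted graph (assumed stand-in for UW-PageRank).

    edges: dict mapping (u, v) -> positive weight; each pair is treated as undirected.
    Returns a dict mapping each vertex to its score.
    """
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = w
        neighbors[v][u] = w

    nodes = list(neighbors)
    n = len(nodes)
    # Total edge weight attached to each vertex, used to normalize its votes.
    strength = {u: sum(neighbors[u].values()) for u in nodes}
    score = {u: 1.0 / n for u in nodes}

    for _ in range(iterations):
        new_score = {}
        for u in nodes:
            # Each neighbor passes on a share of its score proportional to edge weight.
            incoming = sum(score[v] * w / strength[v] for v, w in neighbors[u].items())
            new_score[u] = (1.0 - damping) / n + damping * incoming
        score = new_score
    return score


def score_words(word_sememes, sememe_score):
    """Score each word as the sum of its sememes' scores (aggregation is an assumption)."""
    return {
        word: sum(sememe_score.get(s, 0.0) for s in sememes)
        for word, sememes in word_sememes.items()
    }


def top_keywords(word_scores, k):
    """Return the k highest-scoring words."""
    return sorted(word_scores, key=word_scores.get, reverse=True)[:k]


# Toy data: hypothetical sememes and relatedness weights, not real HowNet entries.
toy_edges = {
    ("computer", "machine"): 1.0,
    ("computer", "software"): 1.0,
    ("computer", "calculate"): 1.0,
}
toy_words = {
    "laptop": ["computer", "machine"],
    "abacus": ["calculate"],
}

sememe_scores = weighted_pagerank(toy_edges)
word_scores = score_words(toy_words, sememe_scores)
keywords = top_keywords(word_scores, k=1)
```

In the full method the same scoring would run twice: once over all candidate definitions to pick one definition per word, and once over the graph rebuilt from the chosen definitions to rank the words.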