Abstract: We introduce Knowledge-enhanced Spherical Representation Learning (K-SRL), a generative probabilistic model of text documents that combines word embeddings and knowledge graph embeddings to encode both the semantic information of a text and its related background knowledge in a low-dimensional representation. Specifically, the proposed model represents each text document as a combination of words and entities linked to a large external knowledge graph, and models them as points on the unit hypersphere using the von Mises-Fisher distribution. We further develop an efficient variational Bayesian inference algorithm to learn unsupervised text embeddings in the spherical space. Experimental results on multiple benchmark datasets demonstrate that our model outperforms existing probabilistic models on common text classification tasks, including text categorization and sentiment analysis.
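To make the spherical modeling assumption concrete, the following is a minimal sketch of the von Mises-Fisher log-density for a unit vector, written with the Python standard library only. It is not the K-SRL model or its inference algorithm; the function names (`log_bessel_i`, `vmf_log_density`, `normalize`) and the truncated power series used for the Bessel function are illustrative assumptions, and the series is only expected to be adequate for moderate concentration values.

```python
import math


def log_bessel_i(v, x, terms=200):
    """Log of the modified Bessel function I_v(x) for x > 0, v >= 0.

    Uses the truncated power series
    I_v(x) = sum_m (x/2)^(2m+v) / (m! * Gamma(m+v+1)),
    summed in log space. Hypothetical helper for illustration only.
    """
    log_half_x = math.log(x / 2.0)
    log_terms = [
        (2 * m + v) * log_half_x - math.lgamma(m + 1) - math.lgamma(m + v + 1)
        for m in range(terms)
    ]
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def normalize(vec):
    """Project a nonzero vector onto the unit hypersphere."""
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec]


def vmf_log_density(x, mu, kappa):
    """Log-density of the von Mises-Fisher distribution at unit vector x.

    mu is the unit mean direction and kappa > 0 the concentration:
    log f(x) = log C_d(kappa) + kappa * <mu, x>, where
    C_d(kappa) = kappa^(d/2-1) / ((2*pi)^(d/2) * I_{d/2-1}(kappa)).
    """
    d = len(x)
    log_norm = (
        (d / 2 - 1) * math.log(kappa)
        - (d / 2) * math.log(2 * math.pi)
        - log_bessel_i(d / 2 - 1, kappa)
    )
    dot = sum(a * b for a, b in zip(x, mu))
    return log_norm + kappa * dot


if __name__ == "__main__":
    mu = normalize([0.0, 0.0, 2.0])
    aligned = vmf_log_density(mu, mu, kappa=2.0)
    orthogonal = vmf_log_density([1.0, 0.0, 0.0], mu, kappa=2.0)
    # Points closer in direction to mu receive higher density.
    print(aligned > orthogonal)
```

The density depends on `x` only through its cosine similarity with the mean direction `mu`, which is why directional distributions of this kind are a natural fit for embeddings compared by angle rather than by magnitude.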