- Abstract: Word embeddings capture lexico-semantic information but remain flawed in their inability to assign unique representations to different senses of a polysemous word. They also fail to incorporate information from well-curated semantic lexicons and dictionaries. Previous approaches that integrate polysemy and knowledge bases fall into two distinct categories: retrofitting vectors to ontologies, or learning from sense-tagged corpora. While these embeddings are superior at capturing contextual similarity, they are outperformed by single-prototype word vectors on several relatedness tasks. In this work, we introduce a new approach that can induce polysemy in any pre-trained embedding space by jointly grounding contextualized sense representations and word embeddings in an ontology and a thesaurus. Beyond word sense induction, the resulting representations reduce the effect of vocabulary bias that arises in natural language corpora, and in turn in embedding spaces. By grounding them in knowledge bases, they can learn multi-word representations and are also interpretable. We evaluate our vectors across 12 datasets on several similarity and relatedness tasks, along with two extrinsic tasks, and find that our approach consistently outperforms the current state of the art.
- Keywords: ontology grounding, sense representations, knowledge bases