Abstract: We introduce a method for learning to embed word senses as defined in a given set of dictionaries. In our approach, sense definition pairs <word, definition> are transformed into low-dimensional vectors trained to maximize the probability of reconstructing the definitions in an autoencoding setting. The method involves automatically training a sense autoencoder to encode sense definitions, automatically aligning sense definitions across dictionaries, and automatically generating embeddings for arbitrary descriptions. At run-time, user queries are mapped into the embedding space and the retrieved sense definitions are re-ranked. We present a prototype sense definition embedding system, SenseNet, that applies the method to two dictionaries. Blind evaluation on a set of real queries shows that the method significantly outperforms a baseline based on the Lesk algorithm. Our methodology also supports combining multiple dictionaries, yielding further improvement in representing dictionary sense definitions.
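The core idea — encoding a definition into a low-dimensional sense vector by learning to reconstruct that definition — can be illustrated with a minimal sketch. This is not the paper's actual architecture (SenseNet presumably uses a learned sequence model); it is a toy bag-of-words autoencoder in plain NumPy, and the vocabulary, dimensions, and learning rate are illustrative assumptions.

```python
import numpy as np

# Toy sketch (assumption, not the authors' implementation): embed a
# <word, definition> pair by training an autoencoder to reconstruct
# the definition's word distribution from a low-dimensional code.

rng = np.random.default_rng(0)

vocab = ["bank", "river", "money", "water", "institution", "edge", "of", "a"]
word_to_idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 3  # vocabulary size, sense-embedding dimension

W_enc = rng.normal(scale=0.1, size=(V, D))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(D, V))  # decoder weights

def bow(definition):
    """Bag-of-words count vector for a definition string."""
    v = np.zeros(V)
    for tok in definition.split():
        if tok in word_to_idx:
            v[word_to_idx[tok]] += 1.0
    return v

def encode(x):
    """Map a definition vector to its low-dimensional sense vector."""
    return np.tanh(x @ W_enc)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# One sense definition pair: <bank, "edge of a river">
x = bow("edge of a river")
target = x / x.sum()  # normalized word distribution to reconstruct
lr = 0.5
for _ in range(500):  # gradient steps on the reconstruction loss
    h = encode(x)
    p = softmax(h @ W_dec)     # predicted distribution over vocab
    grad_out = p - target      # cross-entropy gradient w.r.t. logits
    W_dec -= lr * np.outer(h, grad_out)
    grad_h = (W_dec @ grad_out) * (1 - h**2)  # backprop through tanh
    W_enc -= lr * np.outer(x, grad_h)

sense_vec = encode(x)  # embedding used at run-time for retrieval/re-ranking
```

At run-time, a user query would be embedded with the same encoder, and retrieved sense definitions re-ranked by similarity to the query vector.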
Paper Type: long