Word Meaning Representation in Neural Language Models: Lexical Polysemy and Semantic Relationships (Représentation du sens des mots dans les modèles de langue neuronaux : polysémie lexicale et relations sémantiques)

Abstract: Word embedding representations generated by neural language models encode rich information about language and the world. In this thesis, we investigate the knowledge about word meaning encoded in embedding representations and propose methods to automatically enhance their quality. Our main focus is on contextual models, which generate representations that capture the meaning of word usages in new contexts. These models have come to dominate the fields of NLP and Computational Linguistics, and open up exciting new possibilities for lexical semantics research. The central axis of our research is the exploration of the knowledge about lexical polysemy encoded in word embedding models. We access this knowledge through usage similarity experiments and through the substitute annotations that the models automatically assign to words in context. We study the representations produced by the models in their raw form, and explore how enriching them with external semantic knowledge affects their quality. We evaluate the representations intrinsically on the tasks of usage similarity estimation, word sense clusterability, and polysemy level prediction. Additionally, we employ contextualised representations for detecting words’ semantic relationships, specifically addressing the relative intensity of scalar adjectives. Adopting an interpretability stance, we investigate the knowledge that the models encode about noun properties as expressed in their adjectival modifiers, and about the entailment properties of adjective-noun constructions. Our experiments involve a wide range of contextualised models, which we compare to models that produce static word representations. The majority of our analyses address English, but we also test our assumptions and methodology in a multilingual setting involving monolingual and multilingual models in other languages. Our results demonstrate that contextualised representations encode rich knowledge about word meaning and semantic relationships, acquired during model training and further enriched with information from new contexts of use. We also find that the constructed semantic space encodes abstract semantic notions, such as adjective intensity, which can be useful both for intrinsic lexical semantic analysis and in downstream applications. Our proposed methodology can be used to explore other intrinsic semantic properties of words and their semantic relationships in different languages, leading to a better understanding of the knowledge about language encoded in neural language models.
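
To make the usage similarity setup concrete, here is a minimal sketch (not the thesis code) of how a contextualised model can score the similarity of two usages of the same word: the target word's subword embeddings are pooled from each sentence and compared with cosine similarity. The model choice (bert-base-uncased via the HuggingFace transformers library) and the mean-pooling strategy are assumptions of this illustration.

```python
# A minimal sketch of usage similarity estimation with a contextualised model.
# Assumptions of this illustration: bert-base-uncased, mean-pooling over the
# target word's subword tokens, last-layer hidden states.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Mean-pool the hidden states of the target word's subword tokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the target's subword span inside the encoded sentence.
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError(f"{target!r} not found in {sentence!r}")

u1 = target_embedding("She sat on the bank of the river.", "bank")
u2 = target_embedding("He deposited the cheque at the bank.", "bank")
sim = torch.cosine_similarity(u1, u2, dim=0).item()
print(f"usage similarity: {sim:.3f}")  # low similarity suggests distinct senses
```

In an intrinsic evaluation of this kind, such scores would be compared against human usage similarity judgements; low similarity between the two "bank" usages points towards distinct senses.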
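
The scalar adjective intensity experiments can be sketched in the same spirit. Continuing the snippet above (it reuses target_embedding and the loaded model), the fragment below illustrates one simple projection-based idea: embed adjectives in a neutral carrier sentence and rank them by their projection onto a direction running from a mild to an extreme pole of the scale. The carrier template, the example scale, and the pole words are hypothetical; they stand in for the curated adjective scales used in actual experiments.

```python
# Continuation of the sketch above: a hedged illustration of intensity
# ranking for scalar adjectives via projection onto an intensity direction.
scale = ["good", "great", "wonderful", "fantastic"]  # hypothetical half-scale
template = "It is {}."

def in_context(adj: str) -> torch.Tensor:
    return target_embedding(template.format(adj), adj)

# Intensity direction: extreme pole minus mild pole, normalised.
direction = in_context("fantastic") - in_context("good")
direction = direction / direction.norm()

# Rank adjectives by their projection onto the intensity direction.
scores = {adj: torch.dot(in_context(adj), direction).item() for adj in scale}
for adj, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{adj:10s} {score:+.2f}")  # milder adjectives should score lower
```

Projection onto a derived direction is only one way to operationalise relative intensity; predictions of this kind would be evaluated against human-annotated adjective scales.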