- Abstract: Retrofitting static vector space word representations using external knowledge bases has yielded substantial improvements in their lexical-semantic capacities, but it is non-trivial to apply to contextual word embeddings (CWEs). In this paper, we propose MAKESENSE, a method that 'approximates' retrofitting in CWEs to better infer word sense knowledge from word contexts. We analyze BERT and MAKESENSE-transformed BERT representations across a diverse set of experiments encompassing sense-sensitive similarities, alignment with human-elicited similarity judgments, and probing tasks focusing on sense distinctions and hypernymy. Our findings indicate that MAKESENSE yields substantial improvements in word sense information over vanilla CWEs, while largely preserving, rather than improving, more complex sense usage and directionally sensitive information such as hypernymy.
- Software: zip