Training Bi-Encoders for Word Sense Disambiguation

Published: 01 Jan 2021, Last Modified: 19 Feb 2025 · ICDAR (2) 2021 · CC BY-SA 4.0
Abstract: Modern transformer-based neural architectures yield impressive results in nearly every NLP task, and Word Sense Disambiguation (WSD), the problem of discerning the correct sense of a word in a given context, is no exception. State-of-the-art approaches in WSD today leverage lexical information along with pre-trained embeddings from these models to achieve results comparable to human inter-annotator agreement on standard evaluation benchmarks. In the same vein, we experiment with several strategies to optimize bi-encoders for this specific task and propose alternative methods of presenting lexical information to our model. Through our multi-stage pre-training and fine-tuning pipeline, we further the state of the art in Word Sense Disambiguation.
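The bi-encoder idea the abstract refers to can be sketched as follows: one encoder embeds the word in context, another embeds each candidate sense's dictionary gloss, and the highest-scoring gloss wins. This is only an illustrative toy (the deterministic hash-based "encoder", the function names, and the example glosses are all assumptions, not the paper's actual model, which uses pre-trained transformer encoders):

```python
import hashlib
import numpy as np

EMB_DIM = 8

def encode(text: str) -> np.ndarray:
    # Stand-in for a transformer encoder: a deterministic toy embedding
    # seeded by a hash of the text, purely for illustration.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(EMB_DIM)

def disambiguate(context: str, glosses: dict) -> str:
    # Bi-encoder WSD scoring: the context and each sense gloss are encoded
    # independently, then compared by dot product; the best-scoring sense wins.
    ctx = encode(context)
    scores = {sense: float(ctx @ encode(gloss)) for sense, gloss in glosses.items()}
    return max(scores, key=scores.get)

# Hypothetical sense inventory for the ambiguous word "bank".
glosses = {
    "bank.n.01": "a financial institution that accepts deposits",
    "bank.n.02": "sloping land beside a body of water",
}
print(disambiguate("She sat on the bank of the river.", glosses))
```

Because the two encoders run independently, gloss embeddings can be precomputed once for the whole sense inventory, which is the practical appeal of bi-encoders over joint (cross-encoder) scoring.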