Abstract: We introduce a comprehensive evaluation benchmark for the Polish Word Sense Disambiguation (WSD) task. The benchmark consists of 7 distinct datasets with sense annotations based on plWordNet-4.2. As far as we know, our work is the first attempt to standardise existing sense-annotated data for Polish. Following recent trends in neural WSD, we test transfer learning models as well as hybrid architectures that combine lexico-semantic networks with neural text encoders. Finally, we investigate the impact of bilingual training on WSD performance. The bilingual model achieves new state-of-the-art performance on the Polish WSD task.