Tuning multilingual transformers for language-specific named entity recognition
Abstract: Our paper addresses the problem of multilingual named entity recognition on four languages: Russian, Bulgarian, Czech, and Polish. We solve this task using the BERT model. We use a multilingual model pre-trained on a hundred languages as the base for transfer to the mentioned Slavic languages. Unsupervised pre-training of the BERT model on these four languages allows us to significantly outperform baseline neural approaches and multilingual BERT. An additional improvement is achieved by extending BERT with a word-level CRF layer. Our system was submitted to the BSNLP 2019 Shared Task on Multilingual Named Entity Recognition and took first place in three of the four competition metrics in which we participated. We open-source the NER models and the BERT model pre-trained on the four Slavic languages.
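The abstract describes extending a BERT encoder with a word-level CRF layer for tagging. The sketch below illustrates one common way such a model can be assembled, assuming the `transformers` and `pytorch-crf` packages; the class name, hyperparameters, and tag set are illustrative and not the authors' exact configuration. For brevity the CRF here runs over sub-word positions, whereas a word-level variant would first pool sub-words into word representations.

```python
# Minimal sketch: BERT encoder + CRF head for NER (illustrative, not the
# authors' implementation). Requires `transformers` and `pytorch-crf`.
import torch
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF


class BertCrfTagger(nn.Module):
    def __init__(self, encoder_name: str, num_tags: int, dropout: float = 0.1):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.dropout = nn.Dropout(dropout)
        # Projects contextual token representations to per-tag emission scores.
        self.emission = nn.Linear(self.encoder.config.hidden_size, num_tags)
        # CRF models transitions between adjacent entity tags.
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        emissions = self.emission(self.dropout(hidden))
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: Viterbi decoding of the most likely tag sequence.
        return self.crf.decode(emissions, mask=mask)


# Example instantiation with multilingual BERT and a hypothetical 9-tag
# BIO scheme (e.g. PER/LOC/ORG/MISC plus O).
model = BertCrfTagger("bert-base-multilingual-cased", num_tags=9)
```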