Keywords: Natural language processing, transformer, foundation models, BERT
TL;DR: This paper presents a medical Danish BERT model (MeDa-BERT) trained on a new Danish medical corpus.
Abstract: This paper introduces a medical Danish BERT-based language model (MeDa-BERT) and medical Danish word embeddings. Both were pretrained on a new 133M-token Danish medical corpus drawn from medical books and internet text. The models outperformed general-domain models on medical Danish classification tasks. The medical word embeddings and MeDa-BERT are publicly available.
Student Paper: Yes, the first author is a student