Abstract: We present polT - a general-purpose text-to-text model for Polish that can be fine-tuned on a variety of Natural Language Processing (NLP) tasks with a single training objective. Unsupervised denoising pre-training is performed efficiently by initializing the model weights from the corresponding multilingual T5 (mT5) checkpoint. We evaluate the performance of polT, mT5, Polish BART (plBART), and Polish GPT-2 (papuGaPT2) on diverse downstream tasks: the text-to-text KLEJ benchmark, en-pl machine translation, question answering, and summarization. polT achieves the best scores on all of these tasks except summarization, where plBART performs best. In general (with the exception of summarization), the larger the model, the better the results, and encoder-decoder architectures outperform their decoder-only counterparts. Additionally, since Polish lacks benchmark datasets for summarization and question answering, we describe their construction in detail and will make them publicly available.
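As an illustration of the text-to-text setup described in the abstract, the following is a minimal sketch of initializing from mT5 weights and computing a loss on a single task-prefixed example with HuggingFace transformers. The checkpoint name, task prefix, and example sentences are assumptions for demonstration only, not the paper's actual training configuration.

```python
# Minimal sketch: text-to-text fine-tuning step starting from mT5 weights.
# Assumptions: the "google/mt5-base" checkpoint and the Polish example pair
# are illustrative placeholders, not the paper's data or hyperparameters.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# Every task is cast as text-to-text: a task prefix plus input text maps to output text.
source = "translate English to Polish: The weather is nice today."
target = "Dzisiaj jest ładna pogoda."

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# The denoising/seq2seq objective reduces to cross-entropy on the target tokens.
outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels)
print(outputs.loss.item())
```

In practice, fine-tuning on any of the downstream tasks (KLEJ, translation, question answering, summarization) would differ only in the prefix and the input/output text pairs, which is the point of the single training objective.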