Improvements on a Multi-task BERT Model

Published: 01 Jan 2024, Last Modified: 15 Oct 2024 · SIU 2024 · CC BY-SA 4.0
Abstract: Pre-trained language models have introduced significant performance gains in natural language processing. Fine-tuning these models on downstream tasks' supervised data further improves the results, and combining the learning of several tasks during fine-tuning is an effective approach. This paper proposes a multi-task learning framework based on BERT. To handle the tasks of sentiment analysis, paraphrase detection, and semantic textual similarity, we add linear layers, a Siamese network with cosine similarity, and convolutional layers at the appropriate places in the architecture. We conducted an ablation study using the Stanford Sentiment Treebank (SST), Quora, and SemEval STS datasets to test the framework and the effectiveness of its components. The results demonstrate that the proposed multi-task framework improves the performance of BERT. The best results for sentiment analysis, paraphrase detection, and semantic textual similarity are accuracies of 0.534 and 0.697 and a Pearson correlation coefficient of 0.345, respectively.
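To illustrate how task-specific heads of this kind might sit on top of a shared BERT encoder, the sketch below shows one possible wiring in PyTorch: a linear classifier on the [CLS] token for sentiment, a convolutional layer over Siamese-encoded token embeddings for paraphrase detection, and cosine similarity of pooled Siamese embeddings for semantic textual similarity. The class name, pooling choices, and layer sizes are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of a multi-task BERT with three heads; hyperparameters
# (hidden sizes, kernel size, pooling) are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import BertModel


class MultiTaskBERT(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_sentiment_classes=5):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)  # shared encoder
        hidden = self.bert.config.hidden_size
        # Sentiment analysis: linear layer on the [CLS] representation.
        self.sentiment_head = nn.Linear(hidden, num_sentiment_classes)
        # Paraphrase detection: convolution over token embeddings, then a linear layer.
        self.conv = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)
        self.paraphrase_head = nn.Linear(2 * hidden, 1)
        # Semantic textual similarity: cosine similarity of Siamese embeddings.
        self.cos = nn.CosineSimilarity(dim=-1)

    def encode(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return out.last_hidden_state  # (batch, seq_len, hidden)

    def sentiment(self, input_ids, attention_mask):
        cls = self.encode(input_ids, attention_mask)[:, 0]
        return self.sentiment_head(cls)

    def paraphrase(self, ids_a, mask_a, ids_b, mask_b):
        # Siamese encoding: the same BERT encodes both sentences, then
        # convolution + max-pooling produces a fixed-size vector per sentence.
        a = self.conv(self.encode(ids_a, mask_a).transpose(1, 2)).max(dim=-1).values
        b = self.conv(self.encode(ids_b, mask_b).transpose(1, 2)).max(dim=-1).values
        return self.paraphrase_head(torch.cat([a, b], dim=-1)).squeeze(-1)

    def similarity(self, ids_a, mask_a, ids_b, mask_b):
        a = self.encode(ids_a, mask_a)[:, 0]
        b = self.encode(ids_b, mask_b)[:, 0]
        return self.cos(a, b)
```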