AP-BERT: enhanced pre-trained model through average pooling

Published: 01 Jan 2022, Last Modified: 27 Jun 2023 · Appl. Intell. 2022
Abstract: BERT, a language model pre-trained on large-scale corpora, has achieved breakthrough progress in NLP tasks. However, experimental results show that BERT's performance on Chinese tasks is not ideal. We attribute this to the fact that BERT provides only character-level embeddings, while a single Chinese character often cannot express a complete meaning on its own. To improve the model's ability to capture phrase-level semantic information, this paper proposes an enhanced BERT based on average pooling (AP-BERT). Our model applies an average pooling layer to the token embeddings and reconstructs the model's input embedding, which effectively improves BERT's performance on Chinese natural language processing. Experiments show that the proposed method yields gains on four Chinese tasks: text classification, named entity recognition, reading comprehension, and summary generation. The method not only improves BERT's performance on Chinese tasks but can also be readily applied to other pre-trained language models.
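The abstract does not spell out how the average pooling layer reconstructs the input embedding, so the following is only a minimal sketch of the general idea: pooling character-level token embeddings over phrase spans and folding the pooled vectors back into the input. The function name, the additive combination, and the assumption that phrase spans come from an external Chinese word segmenter are all hypothetical illustration, not the paper's exact method.

```python
import torch

def average_pool_embeddings(char_embeds, word_spans):
    """Hypothetical sketch of phrase-level enrichment via average pooling.

    char_embeds: (seq_len, hidden) character/token embeddings from BERT's
                 embedding layer.
    word_spans:  list of (start, end) index pairs (end exclusive), assumed
                 to come from an external Chinese word segmenter.
    """
    pooled = char_embeds.clone()
    for start, end in word_spans:
        # Mean of the character embeddings that make up one word/phrase.
        phrase_vec = char_embeds[start:end].mean(dim=0)
        # Broadcast the pooled phrase vector back onto its characters
        # (one possible way to "reconstruct" the input embedding).
        pooled[start:end] = pooled[start:end] + phrase_vec
    return pooled

# Toy usage: a 6-character sentence segmented into three two-character words.
embeds = torch.randn(6, 768)
spans = [(0, 2), (2, 4), (4, 6)]
enriched = average_pool_embeddings(embeds, spans)
print(enriched.shape)  # torch.Size([6, 768])
```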