On Effectively Learning of Knowledge in Continual Pre-training

Anonymous

08 Mar 2022 (modified: 05 May 2023) · NAACL 2022 Conference Blind Submission · Readers: Everyone
Paper Link: https://openreview.net/forum?id=FmJgc8RK0l
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Pre-trained language models (PLMs) like BERT have made significant progress in various downstream NLP tasks. However, recent works find that PLMs fall short in acquiring knowledge from unstructured text, as measured by cloze-style tests. To understand the internal behavior of PLMs when retrieving knowledge, we first define knowledge-baring tokens and knowledge-free tokens for unstructured text and manually label them on a sample of the data. We then find that PLMs are more likely to mispredict knowledge-baring tokens and that the self-attention module attends less to those tokens. Based on these observations, we develop two solutions to help the model learn more knowledge from unstructured text. Experiments on knowledge-intensive tasks show the effectiveness of the proposed methods, with absolute improvements of up to 6.7 points on the LAMA probing, closed-book QA, and knowledge graph reasoning tasks.
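
For reference, the snippet below is a minimal sketch (not taken from the paper) of the kind of cloze-style test referenced in the abstract: a factual token is masked and a PLM is asked to recover it, as in LAMA-style probing. The model name and prompt are illustrative assumptions.

    # Minimal cloze-style knowledge probe (illustrative, not the paper's code).
    # A factual token ("Paris") is masked; the PLM must recover it from context.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Print the top predictions and their probabilities for the masked slot.
    for pred in fill_mask("The capital of France is [MASK]."):
        print(f"{pred['token_str']:>10s}  {pred['score']:.3f}")

A model that has acquired the underlying fact should rank the correct token (here, "Paris") highly; systematic failures on such prompts are what motivate the paper's analysis of knowledge-baring tokens.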