BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models

Anonymous

16 Dec 2023, ACL ARR 2023 December Blind Submission
Readers: Everyone
Abstract: Knowledge probing assesses the degree to which a language model (LM) has successfully learned factual, relational knowledge during its pre-training. Such probes serve as an inexpensive way to compare LMs of different sizes and training configurations. However, previous probes rely on the objective function used to pre-train an LM, and are thus applicable only to either masked or causal LMs. This renders a comparison across different types of LM impossible. To address this, we propose an approach that uses an LM's inherent ability to estimate the log-likelihood of any given textual statement. We carefully design an evaluation dataset of 40,916 relation instances and, for each relational fact, produce a set of alternative statements of which exactly one is correct. We then evaluate whether the LM correctly assigns the highest log-likelihood to the correct statement. Our experimental evaluation of 13 common LMs shows that our proposed framework, BEAR, can effectively probe for knowledge across different LM types. We release BEAR as an open-source framework to the research community to facilitate the evaluation and development of LMs.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
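To make the ranking procedure described in the abstract concrete, here is a minimal sketch for the causal-LM case, assuming Hugging Face transformers with gpt2 as an illustrative model; the statements are hypothetical examples, not drawn from the BEAR dataset, and this is not the authors' released implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM would work the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def statement_log_likelihood(text: str) -> float:
    """Return the summed token log-likelihood of `text` under the LM."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # out.loss is the mean negative log-likelihood over the predicted
    # tokens (sequence length minus one); undo the averaging and negate.
    n_predicted = enc["input_ids"].size(1) - 1
    return -out.loss.item() * n_predicted

# One correct statement plus distractors instantiating the same relation
# (hypothetical example of the alternative statements the abstract describes).
statements = [
    "The capital of France is Paris.",   # correct
    "The capital of France is Berlin.",  # distractor
    "The capital of France is Madrid.",  # distractor
]
scores = [statement_log_likelihood(s) for s in statements]
prediction = statements[scores.index(max(scores))]
# The probe counts a hit if the top-ranked statement is the correct one.
print(prediction)
```

For masked LMs, the same ranking idea applies but summed token log-likelihoods would have to be approximated differently (e.g., via pseudo-log-likelihood scoring over masked positions); the paper itself details the exact scoring procedure used by BEAR.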