UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: We propose ParaTrex, a paraphrased dataset for measuring out-of-domain generalisation, and UniArk, an adapter-based unified framework that improves generalisability through debiasing.
Abstract: In recent years, several works have investigated the potential of language models as knowledge bases, as well as the severe biases that arise when extracting factual knowledge from them. In this work, we point out, from a probabilistic view, an inherent misalignment between the pre-training and downstream tuning objectives of language models used for knowledge probing, and we hypothesize that simultaneously debiasing these objectives is key to generalisation over unseen prompts. We propose UniArk, an adapter-based framework for generalised and consistent factual knowledge extraction through simple and parameter-free methods. Extensive experiments show that UniArk significantly improves the model's out-of-domain generalisation as well as its consistency under various prompts. Additionally, we construct ParaTrex, a large-scale and diverse dataset for measuring the inconsistency and out-of-domain generalisation of models. Further, ParaTrex offers a reference method for constructing paraphrased datasets using large language models.
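To make the consistency notion in the abstract concrete: a common way to measure consistency under paraphrased prompts (the setting ParaTrex targets) is pairwise agreement of a model's top-1 predictions across paraphrases of the same fact. The sketch below is a hedged illustration with hypothetical prediction data, not the paper's actual metric or code.

```python
from itertools import combinations

def pairwise_consistency(predictions):
    """Fraction of paraphrase pairs that yield the same top-1
    prediction for one (subject, relation) fact.

    `predictions` is a list of top-1 object strings, one per
    paraphrased prompt. A single prompt is trivially consistent.
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical top-1 predictions for four paraphrases of one fact.
preds = ["Paris", "Paris", "Lyon", "Paris"]
print(pairwise_consistency(preds))  # 3 agreeing pairs of 6 -> 0.5
```

A fully consistent model scores 1.0 on every fact; averaging this score over a paraphrase dataset gives a single inconsistency-style measure comparable across models.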
Paper Type: long
Research Area: NLP Applications
Contribution Types: Model analysis & interpretability, Approaches low compute settings-efficiency, Data resources
Languages Studied: English