Submission Type: Regular Long Paper
Submission Track: Commonsense Reasoning
Submission Track 2: NLP Applications
Keywords: Factual Error Correction, GPT, Domain Adaptation, Distribution Shift
TL;DR: We introduce a scientific claim correction system that makes no domain assumptions and does not require a verifier, yet outperforms existing methods by an order of magnitude.
Abstract: Due to the prohibitively high cost of creating error correction datasets, most Factual Claim Correction methods rely on a powerful verification model to guide the correction process. This leads to a significant drop in performance in domains like Scientific Claim Correction, where good verification models do not always exist. In this work, we introduce SciFix, a claim correction system that does not require a verifier yet outperforms existing methods by a considerable margin, achieving correction accuracies of 84% on the SciFact dataset, 77% on SciFact-Open, and 72.75% on the CovidFact dataset, compared to next-best accuracies of 7.6%, 5%, and 15% on the same datasets, respectively. Our method leverages the power of prompting with LLMs during training to create a richly annotated dataset that can be used for fully supervised training and regularization. We additionally use a claim-aware decoding procedure to improve the quality of corrected claims. Our method outperforms the very LLM that was used to generate the annotated dataset: few-shot prompting with GPT-3.5 achieves consistently lower correction accuracies of 58%, 61%, and 64% on the respective datasets, despite using nearly 800 times as many parameters as our model.
Submission Number: 299