Abstract: Recent advances in natural language processing (NLP) have greatly improved the performance of language models on reasoning and generation tasks. However, a well-known shortcoming of language models is their tendency to generate untrue information, referred to as \emph{hallucinations}. To help improve the factual correctness of language models, we improve both the performance and the computational efficiency of models trained to classify claims as true or false. We use the \textsc{FactKG} dataset, which is constructed from the \emph{DBpedia} knowledge graph extracted from Wikipedia. We create fine-tuned text models, as well as hybrid models combining graphs and text, that significantly outperform the benchmark \textsc{FactKG} models and all other known approaches, both in test-set accuracy and in training time. The gains in performance and efficiency stem from simplifying the subgraph-retrieval step, using simple logical retrievals rather than fine-tuned language models. Finally, we construct prompts for ChatGPT 4o that achieve decent performance without the need for fine-tuning.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: Machine learning, knowledge graphs, fact verification, language models
Contribution Types: Model analysis & interpretability, Approaches for low-compute settings-efficiency
Languages Studied: English
Submission Number: 1615