Enhancing LLM Factuality for Structured Data

15 Sept 2025 (modified: 25 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: factuality, structured data
TL;DR: Enhancing factuality of LLMs using structured data. We propose a methodology to produce robust and realistic data for both evaluation and model improvement, starting from an input knowledge base.
Abstract: Large language models (LLMs) are typically optimized to process and output high-quality unstructured text, demonstrating remarkable capabilities across a variety of natural language tasks. Yet in practice, many domains, such as safety-critical or enterprise applications, rely on structured data. Improving the factuality of contemporary LLMs in these scenarios remains an open challenge, given their propensity to hallucinate or generate incorrect responses. In this work, we propose a methodology to enhance the factuality of LLMs using structured data. Specifically, we use an input knowledge base to generate type-constrained negative samples, and then feed these samples into a novel verbalization procedure that generates longer context paragraphs. We demonstrate that our generated examples provide a more realistic and effective basis for both factuality evaluation and model improvement.
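The pipeline the abstract describes can be sketched in two steps: corrupt a knowledge-base triple with a type-compatible (and therefore plausible) but false object, then verbalize the result into a longer context paragraph. The toy knowledge base, entity-type map, function names, and the template-based verbalizer below are all illustrative assumptions, not the paper's actual procedure:

```python
# Hypothetical sketch of type-constrained negative sampling over a
# knowledge base of (subject, relation, object) triples, followed by a
# simple template verbalizer. All names and data here are assumptions.

# Toy knowledge base: (subject, relation, object) triples.
KB = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Rome", "capital_of", "Italy"),
    ("Seine", "flows_through", "Paris"),
]

# Entity types, assumed to come from the knowledge base's schema.
ENTITY_TYPE = {
    "Paris": "city", "Berlin": "city", "Rome": "city",
    "France": "country", "Germany": "country", "Italy": "country",
    "Seine": "river",
}

def type_constrained_negatives(triple, kb, entity_type):
    """Corrupt the object with entities of the SAME type, so each
    negative is type-correct (plausible) but factually wrong."""
    subj, rel, obj = triple
    # Objects the KB actually asserts for this subject/relation.
    true_objects = {o for s, r, o in kb if s == subj and r == rel}
    candidates = [e for e, t in entity_type.items()
                  if t == entity_type[obj] and e not in true_objects]
    return [(subj, rel, cand) for cand in candidates]

def verbalize(triple):
    """Stand-in for the paper's verbalization step: turn a triple into
    a short context paragraph via a fixed template."""
    subj, rel, obj = triple
    templates = {
        "capital_of": f"{subj} serves as the capital of {obj}.",
        "flows_through": f"The {subj} flows through {obj}.",
    }
    return templates[rel]

negatives = type_constrained_negatives(("Paris", "capital_of", "France"),
                                       KB, ENTITY_TYPE)
# Yields type-correct but false triples such as
# ("Paris", "capital_of", "Germany"), each of which can then be
# verbalized into a fluent, realistic negative context.
```

The type constraint is what makes the negatives hard: "Paris is the capital of Germany" is grammatical and type-plausible, so distinguishing it from the true triple requires factual knowledge rather than surface cues.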
Primary Area: reinforcement learning
Submission Number: 6373