FactNLI: Dynamic and Automated Fact-Enhanced Augmentation of NLI Benchmarks

Yihan Lu; Junzhe Zhang; Huixuan Zhang; Xiaojun Wan

18 Sept 2025 (modified: 12 Feb 2026), ICLR 2026 Conference Desk Rejected Submission, CC BY 4.0

Keywords: Natural Language Inference, Benchmark Construction

Abstract: Natural Language Inference (NLI) is a core task for language understanding, yet existing NLI datasets are static and no longer challenging, allowing current Large Language Models (LLMs) to perform well without truly revealing their capabilities and shortcomings. To address this problem, we propose a new data augmentation framework to automatically build more challenging NLI datasets based on existing datasets, by iteratively fusing rich facts into the premise and hypothesis of an NLI instance. We use a strict fact filter to ensure that fused facts are non-contradictory and non-redundant. Applied to SNLI and MNLI, our augmentation substantially increases data length and complexity, and the performance of a range of LLMs on the augmented datasets drops significantly (up to 30%). Ablation experiments and human quality checks confirm the high quality of the generated data.
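The iterate-and-filter idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `NLIInstance`, `is_redundant`, `is_contradictory`, and `augment` are hypothetical, the token-overlap and negation checks are crude stand-ins for whatever fact filter the paper actually uses, and for brevity facts are fused only into the premise (the abstract says both premise and hypothesis are enriched). The sketch also assumes the label is left unchanged by fusion.

```python
from dataclasses import dataclass

NEGATIONS = {"not", "no", "never"}


@dataclass
class NLIInstance:
    premise: str
    hypothesis: str
    label: str  # "entailment", "neutral", or "contradiction"


def tokens(text):
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


def is_redundant(fact, context, threshold=0.8):
    """Toy redundancy check: most of the fact's words already appear in context."""
    fact_tokens = tokens(fact)
    if not fact_tokens:
        return True
    overlap = len(fact_tokens & tokens(context)) / len(fact_tokens)
    return overlap >= threshold


def is_contradictory(fact, context):
    """Toy contradiction check: a context sentence matches the fact except for negation."""
    fact_tokens = tokens(fact)
    fact_core = fact_tokens - NEGATIONS
    fact_negated = bool(fact_tokens & NEGATIONS)
    for sentence in context.split("."):
        sent_tokens = tokens(sentence)
        if not sent_tokens:
            continue
        same_core = (sent_tokens - NEGATIONS) == fact_core
        sent_negated = bool(sent_tokens & NEGATIONS)
        if same_core and sent_negated != fact_negated:
            return True
    return False


def augment(instance, candidate_facts, max_rounds=3):
    """Fuse up to max_rounds facts that pass both filters (assumed simplification)."""
    premise = instance.premise
    fused = 0
    for fact in candidate_facts:
        if fused >= max_rounds:
            break
        context = premise + " " + instance.hypothesis
        if is_redundant(fact, context) or is_contradictory(fact, context):
            continue  # rejected by the filter
        premise = premise.rstrip() + " " + fact
        fused += 1
    return NLIInstance(premise, instance.hypothesis, instance.label)


example = NLIInstance(
    premise="A man is playing a guitar.",
    hypothesis="A man is making music.",
    label="entailment",
)
facts = [
    "A man is playing a guitar.",       # redundant with the premise
    "A man is not playing a guitar.",   # rejected by the filter
    "The guitar was built in Spain.",   # new and compatible, so it is fused
]
augmented = augment(example, facts)
print(augmented.premise)
```

In a realistic setting the two filter functions would presumably be replaced by a stronger model-based check; the sketch only shows the control flow of filtering candidate facts before each fusion step.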

Supplementary Material: zip

Primary Area: datasets and benchmarks

Submission Number: 10938
