THE-Tree: Can Tracing Historical Evolution Enhance Scientific Verification and Reasoning?

18 Sept 2025 (modified: 26 Jan 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Scientific discovery verification; Evidence-Based Verification; Scientific Evolution
Abstract: Large Language Models (LLMs) are accelerating scientific idea generation, but rigorously evaluating these numerous, often superficial, AI-generated propositions for novelty and factual accuracy is a critical bottleneck, and manual verification is too slow. Existing validation methods are inadequate: LLMs as standalone verifiers may hallucinate and lack domain knowledge (our findings show $\sim 60\%$ unawareness of relevant papers in specific domains), traditional citation networks lack explicit causality, and narrative surveys are unstructured. What is missing is structured, verifiable, and causally linked historical data on scientific evolution. To address this, we introduce $\textbf{THE-Tree}$ ($\textbf{T}$echnology $\textbf{H}$istory $\textbf{E}$volution $\textbf{Tree}$), a computational framework that constructs such domain-specific evolution trees from scientific literature. THE-Tree employs a search algorithm to explore evolutionary paths using a novel $\textbf{``Think-Verbalize-Cite-Verify''}$ process: an LLM proposes potential advancements and cites supporting literature, while each proposed evolutionary link is validated for logical coherence and evidential support by interrogating the cited literature. We construct and validate $88$ THE-Trees across diverse domains and release a benchmark dataset including up to $71k$ fact verifications covering $27k$ papers to foster further research. Experiments demonstrate that i) in graph completion, THE-Tree improves hit@1 by $8\%$ to $14\%$ across multiple models compared to traditional citation networks; ii) for predicting future scientific developments, it improves the hit@1 metric by nearly $10\%$; and iii) when combined with other methods, it boosts the performance of evaluating important scientific papers by almost $100\%$.
By constructing explicit, verifiable pathways of scientific progression, THE-Tree provides a robust historical foundation for evaluating new hypotheses (human or AI-generated) and enables a computable science history, fostering evidence-based AI-driven scientific discovery.
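For readers unfamiliar with the hit@1 metric reported above, the following is a minimal sketch of the standard hit@k computation: the fraction of queries whose correct item appears among the top k ranked candidates. The function name `hit_at_k` and its input layout are illustrative assumptions, not the authors' evaluation code, and the abstract does not specify how candidates are ranked.

```python
from typing import Hashable, Sequence


def hit_at_k(
    ranked_candidates: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
    k: int = 1,
) -> float:
    """Return the fraction of queries whose gold item is in the top-k candidates.

    ranked_candidates[i] is a best-first ranking for query i (hypothetical layout);
    gold[i] is the single correct item for query i.
    """
    if len(ranked_candidates) != len(gold):
        raise ValueError("ranked_candidates and gold must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not gold:
        return 0.0  # no queries: define the score as 0 rather than divide by zero

    # Count queries where the gold item appears within the first k ranked candidates.
    hits = sum(
        1 for candidates, target in zip(ranked_candidates, gold)
        if target in candidates[:k]
    )
    return hits / len(gold)
```

With k = 1 the metric only rewards a model when its single highest-ranked candidate (for example, a predicted missing node in graph completion) is the correct one, which is why it is a strict measure.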
Primary Area: datasets and benchmarks
Submission Number: 11878