GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation
TL;DR: LLMs can conduct faithful and efficient reasoning by combining their parametric knowledge with limited external information to tackle difficult scientific tasks.
Abstract: Existing approaches that improve the reasoning capacities of large language models (LLMs) through context prompting or reinforcement learning (RL) depend on the LLMs' internal knowledge to produce reliable chain-of-thought (CoT) reasoning. However, regardless of model size, certain problems cannot be resolved in a single forward pass. Meanwhile, agent-based reasoning systems require access to a comprehensive non-parametric knowledge base, which is often costly or infeasible to obtain for scientific and niche domains. We present Graph Inspired Veracity Extrapolation (GIVE), a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input. GIVE guides the LLM agent to select the most pertinent expert data ($\textbf{observe}$), engage in query-specific associative thinking ($\textbf{reflect}$), and then synthesize this information to produce the final output ($\textbf{speak}$). Extensive experiments demonstrate the following benefits of our framework: (1) GIVE improves the performance of LLMs across various sizes. (2) In some scenarios, GIVE allows smaller LLMs to surpass larger, more sophisticated ones on scientific tasks ($\textbf{GPT3.5T + GIVE > GPT4}$). (3) GIVE is effective on both scientific and open-domain assessments. (4) GIVE is a training-free method that enables LLMs to tackle new problems beyond their training data (up to a $\textbf{43.5}$\% $\rightarrow$ $\textbf{88.2}$\% accuracy improvement). (5) GIVE allows LLM agents to reason over both restricted (very small) and noisy (very large) knowledge sources, accommodating knowledge graphs (KG) ranging from $\textbf{135}$ to more than $\textbf{840k}$ nodes. (6) The reasoning process of GIVE is fully interpretable. Our code is available at https://github.com/Jason-Tree/GIVE
Lay Summary: Large language models are powerful text generators, but they often stumble on complex questions because they lack domain-specific knowledge. Existing fixes either lean entirely on the model’s internal “memory” or demand huge external databases, both of which can be impractical for specialized scientific topics. We introduce Graph Inspired Veracity Extrapolation (GIVE), a three-step method that lets the model tap into the right expert facts, think through them step by step, and then craft a clear answer. GIVE first observes by selecting pertinent data, then reflects through query-specific associative thinking, and finally speaks by synthesizing a coherent response. In experiments, GIVE boosts reasoning accuracy across model sizes, enabling smaller models to outperform much larger ones on scientific tasks (for example, GPT-3.5 + GIVE beats GPT-4). It works without any additional training, handles knowledge graphs ranging from roughly a hundred nodes to hundreds of thousands, and shines in both open-domain and niche scientific benchmarks. Because its simple observe-reflect-speak process is fully interpretable, GIVE offers a transparent, training-free way to give LLMs real-world reasoning power.
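To make the observe-reflect-speak structure concrete, here is a minimal Python sketch of such a pipeline. It is an illustrative assumption, not the authors' implementation: the retrieval in `observe` is a naive keyword-overlap heuristic, and `call_llm` is a placeholder for any chat-completion function the reader supplies.

```python
# Minimal sketch of an observe-reflect-speak pipeline over a small knowledge graph.
# All names (observe, reflect, speak, call_llm) and prompts are illustrative
# assumptions, not the GIVE authors' actual code.

from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def observe(query: str, kg: List[Triple], top_k: int = 5) -> List[Triple]:
    """Select the expert triples most pertinent to the query.
    Here: naive keyword overlap; the paper's retrieval is more involved."""
    scored = [
        (sum(ent.lower() in query.lower() for ent in (h, t)), (h, r, t))
        for h, r, t in kg
    ]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [triple for score, triple in scored[:top_k] if score > 0]


def reflect(query: str, evidence: List[Triple], call_llm: Callable[[str], str]) -> str:
    """Ask the LLM to extrapolate query-specific connections from sparse evidence."""
    facts = "\n".join(f"{h} --{r}--> {t}" for h, r, t in evidence)
    prompt = (
        f"Known expert facts:\n{facts}\n\n"
        f"Question: {query}\n"
        "Using these facts and your own knowledge, hypothesize the most plausible "
        "intermediate connections relevant to the question."
    )
    return call_llm(prompt)


def speak(query: str, reasoning: str, call_llm: Callable[[str], str]) -> str:
    """Synthesize the final answer from the reflected reasoning."""
    prompt = (
        f"Question: {query}\n"
        f"Reasoning notes:\n{reasoning}\n"
        "Give a concise final answer."
    )
    return call_llm(prompt)


def give_answer(query: str, kg: List[Triple], call_llm: Callable[[str], str]) -> str:
    """Run the three stages end to end: observe, reflect, speak."""
    evidence = observe(query, kg)
    reasoning = reflect(query, evidence, call_llm)
    return speak(query, reasoning, call_llm)
```

In this sketch the external KG can be tiny (a handful of triples) or large; the LLM's own parametric knowledge is drawn in during the reflect step, which mirrors the paper's claim of working with both restricted and noisy knowledge sources.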
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/Jason-Tree/GIVE
Primary Area: Deep Learning->Large Language Models
Keywords: large language models, reasoning, agent, biomedical question answering, AI for science, knowledge graph
Submission Number: 8057