Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

Published: 24 Jul 2025, Last Modified: 04 Oct 2025
Venue: XLLM-Reason-Plan
License: CC BY 4.0
Keywords: large language models, synthetic data, graph-based learning, multi-hop reasoning
TL;DR: Graph-based synthetic data and a task-specific prompting strategy significantly improve LLM logical reasoning without sacrificing general capabilities.
Abstract: Despite recent advances in training and prompting strategies for Large Language Models (LLMs), these models continue to face challenges with complex logical reasoning tasks that involve long reasoning chains. In this work, we explore the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance LLMs’ reasoning capabilities. Our extensive experiments, conducted on two established natural language reasoning tasks, inductive reasoning and spatial reasoning, demonstrate that supervised fine-tuning (SFT) with synthetic graph-based reasoning data effectively enhances LLMs’ reasoning performance, without compromising their effectiveness on other standard evaluation benchmarks.
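The page itself contains no code; purely as a hedged illustration of what "graph-based synthetic reasoning data" for SFT could look like, the minimal sketch below samples a random relation graph, extracts a multi-hop path, and verbalizes it as a (prompt, target) pair. All names here (`ENTITIES`, `sample_graph`, `verbalize`, the single "is connected to" relation) are hypothetical, not taken from the paper.

```python
import random

ENTITIES = [f"entity_{i}" for i in range(20)]
RELATION = "is connected to"  # hypothetical single relation

def sample_graph(num_edges=30, seed=0):
    """Sample a random directed graph over the entity set."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < num_edges:
        a, b = rng.sample(ENTITIES, 2)
        edges.add((a, b))
    return sorted(edges)

def sample_path(edges, hops=3, seed=0):
    """Random-walk a multi-hop path through the graph, retrying on dead ends."""
    rng = random.Random(seed)
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    for _ in range(1000):
        path = [rng.choice(list(adj))]
        for _ in range(hops):
            if path[-1] not in adj:
                break
            path.append(rng.choice(adj[path[-1]]))
        if len(path) == hops + 1:
            return path
    return None

def verbalize(edges, path):
    """Turn the graph's facts plus a path query into an SFT (prompt, target) pair."""
    facts = ". ".join(f"{a} {RELATION} {b}" for a, b in edges)
    question = f"Is {path[0]} connected to {path[-1]} via {len(path) - 1} hops?"
    chain = " -> ".join(path)
    return {"prompt": f"{facts}. {question}", "target": f"Yes, via {chain}."}

edges = sample_graph()
path = sample_path(edges, hops=3)
if path:
    example = verbalize(edges, path)
    print(example["prompt"][:120], "...")
    print(example["target"])
```

Because the graph is generated programmatically, the gold reasoning chain is known by construction, which is what makes this kind of data usable as a supervision signal for multi-hop reasoning.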
Paper Published: No
Paper Category: Short Paper
Demography: Prefer not to say
Academic: Others
Submission Number: 7