Keywords: LLM agents, reinforcement learning, curriculum learning, graph-augmented reasoning, tool use
Abstract: Large language models (LLMs) increasingly rely on external knowledge to improve factuality, yet many real-world knowledge sources are organized as heterogeneous graphs rather than plain text.
Reasoning over such graph-structured knowledge poses two key challenges:
(1) navigating structured, schema-defined relations requires precise function calls rather than similarity-based retrieval, and (2) answering complex questions often demands multi-hop evidence aggregation through iterative information seeking.
We propose GraphDancer, a reinforcement learning (RL) framework that teaches LLMs to navigate graphs by interleaving reasoning and function execution.
To make RL effective for moderate-sized LLMs, we introduce a graph-aware curriculum that schedules training by the structural complexity of information-seeking trajectories using an easy-to-hard biased sampler.
We evaluate GraphDancer on a multi-domain benchmark by training on one domain only and testing on unseen domains and out-of-distribution question types. Despite using only a 3B backbone, GraphDancer outperforms baselines equipped with either a 14B backbone or \texttt{GPT-4o-mini}, demonstrating robust cross-domain generalization of graph exploration and reasoning skills. Our code is available at \url{https://anonymous.4open.science/r/GraphDancer}.
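The easy-to-hard biased sampler mentioned in the abstract can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: the function names, the use of normalized hop count as the structural-complexity score, and the exponential weighting scheme are all hypothetical.

```python
# Hypothetical sketch of an easy-to-hard biased curriculum sampler.
# Assumption: each trajectory carries a structural-complexity score in
# [0, 1] (e.g., normalized hop count of its information-seeking path).
import math
import random

def biased_sample(trajectories, progress, temperature=0.2):
    """Sample one training trajectory, favoring structurally simple
    ones early in training (progress ~ 0) and harder ones later
    (progress ~ 1).

    trajectories: list of (item, complexity) pairs, complexity in [0, 1].
    progress: fraction of training completed, in [0, 1].
    """
    # Weight each trajectory by its closeness to the target difficulty,
    # which shifts from easy (0) to hard (1) as training progresses.
    weights = [
        math.exp(-abs(complexity - progress) / temperature)
        for _, complexity in trajectories
    ]
    return random.choices(trajectories, weights=weights, k=1)[0][0]
```

Under this sketch, at `progress=0.0` low-complexity trajectories dominate the sampling distribution, and the bias smoothly reverses as training advances; the `temperature` parameter (an assumed knob) controls how sharply the sampler concentrates on the target difficulty.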
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: knowledge graphs, clinical NLP, legal NLP
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 4063