RLKGF: Reinforcement Learning from Knowledge Graph Feedback Without Human Annotations

ACL ARR 2025 February Submission 6037 Authors

16 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Reinforcement Learning from Human Feedback (RLHF) has been shown to effectively align large language models (LLMs) with human knowledge. However, the requirement for human preference labels remains a significant bottleneck when applying RLHF to a downstream domain. Existing evaluations of LLMs primarily focus on the semantic relevance between questions and responses, as well as the accuracy of the reasoning paths, which align with the implicit semantics and explicit structural links in knowledge graphs (KGs). Inspired by this observation, we propose Reinforcement Learning from Knowledge Graph Feedback (RLKGF), a novel method that leverages KG semantics and structure to derive RL rewards in the absence of manual annotations. Unlike Reinforcement Learning from AI Feedback (RLAIF), RLKGF directly integrates the human priors encoded in KGs as the reward model, aligning LLM responses with expert knowledge without additional preference labeling or reward-model training. RLKGF structures context-relevant facts into knowledge subgraphs and defines rewards by simulating information flow across the semantic and logical connections between question and candidate-response entities. Experiments on three public and one private medical dialogue dataset demonstrate that RLKGF significantly outperforms competitive RLAIF in improving LLM diagnostic accuracy, highlighting the effectiveness of KG-based reward feedback for LLM knowledge alignment. Code will be available.
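To make the abstract's reward mechanism concrete, the sketch below illustrates one plausible reading of a KG-derived reward: restrict the KG to a context-relevant subgraph around the linked question and response entities, then score the response by how much simulated information flow from the question entities reaches the response entities. The two-hop subgraph cut, the personalized-PageRank propagation, and all function names are illustrative assumptions, not the authors' actual formulation.

```python
# Hypothetical sketch of a KG-based reward in the spirit of RLKGF.
# The subgraph construction and propagation rule are illustrative assumptions.
import networkx as nx


def kg_reward(kg: nx.Graph, question_entities, response_entities,
              alpha: float = 0.85) -> float:
    """Score a candidate response by the information flow it receives
    from the question entities over a context-relevant knowledge subgraph."""
    # Keep only context-relevant facts: the subgraph induced by nodes within
    # two hops of any linked question or response entity (an assumed cutoff).
    seeds = [e for e in set(question_entities) | set(response_entities) if e in kg]
    if not seeds:
        return 0.0
    neighborhood = set(seeds)
    for e in seeds:
        neighborhood |= set(nx.single_source_shortest_path_length(kg, e, cutoff=2))
    subgraph = kg.subgraph(neighborhood)

    # Simulate information flow with a random walk restarted at the question
    # entities (personalized PageRank); responses whose entities are well
    # connected to the question through expert knowledge accumulate more mass.
    sources = {e: 1.0 for e in question_entities if e in subgraph}
    if not sources:
        return 0.0
    scores = nx.pagerank(subgraph, alpha=alpha, personalization=sources)
    return sum(scores.get(e, 0.0) for e in response_entities)
```

In an RLKGF-style training loop, a scalar reward of this kind would stand in for a learned reward model, e.g. fed to a PPO trainer after entity linking on each generated response; the entity linker and RL algorithm are likewise left unspecified here.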
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: healthcare applications, clinical NLP, knowledge graphs
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: Chinese
Submission Number: 6037