Keywords: Large Language Models, Knowledge Manipulation, Post-training
Abstract: Despite the impressive performance of large language models (LLMs) pretrained on vast knowledge corpora, advancing their knowledge manipulation—the ability to effectively **recall, reason, and transfer relevant knowledge**—remains challenging.
Existing methods mainly leverage Supervised Fine-Tuning (SFT) on labeled datasets to enhance LLMs' knowledge manipulation ability. However, we observe that SFT models still exhibit the *known-but-incorrect* phenomenon: they explicitly possess the knowledge relevant to a given question, yet fail to leverage it to answer correctly.
To address this challenge, we propose KALE (**K**nowledge-**A**ware **LE**arning)—a post-training framework that leverages knowledge graphs (KGs) to generate high-quality rationales and enhance LLMs' knowledge manipulation ability.
Specifically, KALE first introduces a **K**nowledge-**I**nduced (KI) data synthesis method that efficiently extracts multi-hop reasoning paths from KGs to generate high-quality rationales for question-answer pairs.
Then, KALE employs a **K**nowledge-**A**ware (KA) fine-tuning paradigm that enhances knowledge manipulation by internalizing rationale-guided reasoning through minimizing the KL divergence between predictions with and without rationales.
Extensive experiments on eight popular benchmarks across six different LLMs demonstrate the effectiveness of KALE, achieving accuracy improvements of up to 11.72% and an average of 4.18%.
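The KA fine-tuning paradigm described above hinges on a KL-divergence term between the model's answer distribution with the rationale in context and its distribution without it. A minimal sketch of that term is below; the logits, the three-token vocabulary, and the variable names are invented for illustration, and the full KALE objective (how this term is weighted and combined with the standard SFT loss) is not specified in the abstract.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i == 0 vanish.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits at the same answer position from two
# forward passes: one conditioned on the KG-derived rationale, one without.
# Minimizing KL(p_with || p_without) pulls the rationale-free prediction
# toward the rationale-guided one, internalizing the reasoning.
logits_with_rationale = [2.0, 0.5, -1.0]
logits_without_rationale = [1.0, 1.0, 0.0]

p_with = softmax(logits_with_rationale)
p_without = softmax(logits_without_rationale)
ka_loss_term = kl_divergence(p_with, p_without)
```

In practice this would be computed per answer token over the model's full vocabulary (e.g. with `torch.nn.functional.kl_div` on log-probabilities) and averaged, but the toy version above shows the quantity being minimized.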
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Language Modeling, Data augmentation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Data resources, Data analysis
Languages Studied: English, Chinese, French
Submission Number: 5516