KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
Abstract: By integrating external knowledge, Retrieval-Augmented Generation (RAG) has become an effective strategy
for mitigating the hallucinations that large language models (LLMs) exhibit on knowledge-intensive tasks.
However, when external non-parametric supporting evidence is combined with internal parametric knowledge,
knowledge conflicts inevitably arise, leading to confusion in the model’s responses.
To improve the knowledge selection of LLMs across varying contexts, some research has focused on refining their
behavior patterns through instruction tuning. Nonetheless, lacking explicit negative signals and comparative
objectives, models fine-tuned in this manner may still exhibit undesirable behaviors such as contextual
ignorance (disregarding relevant retrieved evidence) and contextual overinclusion (adhering to irrelevant or
misleading context).
To this end, we propose a Knowledge-aware Preference Optimization strategy, dubbed KnowPO, which aims at
adaptive knowledge selection based on contextual relevance in real retrieval scenarios. Concretely, we propose
a general paradigm for constructing knowledge-conflict datasets that comprehensively covers the various error
types, and we use preference optimization to teach the model to avoid these negative signals. In addition, we
propose a rewriting strategy and a data-ratio optimization strategy to address preference imbalance.
Experimental results show that KnowPO outperforms previous methods for handling knowledge conflicts by over
37%, while also exhibiting robust generalization across various out-of-distribution datasets.