<div align="center">

# Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement


</div>


**Knowledgeable-R1** is an effective RL training strategy for LLMs. It uses joint sampling and defines multiple policy distributions during knowledge-capability exploration, which encourages the model to integrate parametric and contextual knowledge on its own. Experiments show that Knowledgeable-R1 significantly improves robustness and reasoning accuracy in both knowledge-conflict and general RAG scenarios: it outperforms SOTA baselines by 23% in counterfactual scenarios and shows no degradation when the retrieved context is fully accurate.

🎯 **Key Benefits**:
- **No additional cost**: only the rollout strategy and the RL objective are modified
- **Easy to adopt**: no additional components or complex multi-prompt pipelines are required when the model is applied
- **Superior generalization**: Knowledgeable-R1 significantly improves robustness and reasoning accuracy on both parametric-contextual knowledge conflict tasks and general RAG tasks
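
The joint sampling idea mentioned above can be illustrated with a small, self-contained sketch. It is not the repository's implementation. It assumes that, for each question, rollouts are drawn from a query-only prompt (parametric knowledge) and from a query-plus-context prompt (contextual knowledge), and that their rewards are normalized together in the GRPO style. The prompt templates, the function names, and the pooled normalization are all hypothetical choices made for illustration; the actual objective is defined by the training scripts.

``` python
from statistics import mean, pstdev


def build_prompts(question, context):
    """Build a query-only and a query-plus-context prompt (hypothetical templates)."""
    parametric = f"Question: {question}\nAnswer:"
    contextual = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return parametric, contextual


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style normalization over one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def joint_advantages(parametric_rewards, contextual_rewards):
    """Pool both rollout groups, normalize jointly, then split them again.

    Assumption: pooling is only one possible reading of "joint sampling".
    """
    pooled = list(parametric_rewards) + list(contextual_rewards)
    advantages = group_relative_advantages(pooled)
    n = len(parametric_rewards)
    return advantages[:n], advantages[n:]


if __name__ == "__main__":
    # Toy counterfactual case: the retrieved context contradicts what the model knows.
    p_prompt, c_prompt = build_prompts(
        "Who wrote Hamlet?", "Hamlet was written by Christopher Marlowe."
    )
    # Toy rewards (1.0 = correct answer), not real results.
    adv_p, adv_c = joint_advantages([1.0, 1.0, 0.0], [0.0, 0.0, 0.0])
    print(adv_p)
    print(adv_c)
```

In this toy case the correct query-only rollouts receive positive advantages and the context-misled rollouts receive negative ones, which is the kind of signal that could push a policy to fall back on parametric knowledge when the context is wrong.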


## 🙌 Environment
The dependencies are listed in `requirements.txt`. Install them with:
``` bash
pip install -r requirements.txt
```


## Training
Run the script for the setup you want to train:
### GRPO w/ RAG
``` bash
bash training_scripts/qwen2_5_7b_knowledge_confiqa_mc_grpo.sh 
```
### Knowledgeable-R1 (Ours)
``` bash
bash training_scripts/qwen2_5_7b_knowledge_confiqa_mc_knowledgeable_r1.sh
```
## Evaluation
``` bash
bash eval_query_only.sh     # query only (no retrieved context)
bash eval_query_with_rag.sh # query with RAG
bash eval_ours.sh           # Knowledgeable-R1 (ours)
```
