Abstract: Large Language Models (LLMs) have inherent knowledge deficiencies due to insufficient or erroneous data and incomplete training strategies. Furthermore, LLMs are often overconfident and unaware of these deficiencies, which poses safety and legal risks to users. Inspired by the process of human introspection, we propose a two-stage method that enables LLMs to master the capability of knowledge introspection. Our method relies only on data generated by the LLM itself, and teaches the LLM to distinguish among what is known, uncertain, and unknown. The method is trained in two stages: supervised fine-tuning is employed in the first stage, and direct preference optimization is utilized in the second stage. Experimental results demonstrate that our method effectively enhances the LLM's understanding of its internal knowledge, and significantly improves the accuracy, reliability, and helpfulness of the model's responses.
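The abstract's core idea of bucketing questions into known, uncertain, and unknown based on the model's own outputs can be sketched as follows. This is a minimal illustration, not the paper's actual procedure: the self-consistency proxy (agreement of sampled answers with a reference) and the threshold values are assumptions for illustration only.

```python
# Hypothetical sketch: bucket a question into "known"/"uncertain"/"unknown"
# by how often the model's self-sampled answers agree with a reference.
# The proxy and the 0.8 / 0.2 thresholds are illustrative assumptions,
# not the paper's actual criteria.

def label_knowledge(sampled_answers, reference):
    """Label a question by the fraction of sampled answers matching the reference."""
    correct = sum(answer == reference for answer in sampled_answers)
    rate = correct / len(sampled_answers)
    if rate >= 0.8:        # consistently correct -> treated as known
        return "known"
    if rate >= 0.2:        # intermittently correct -> uncertain
        return "uncertain"
    return "unknown"       # almost never correct -> unknown

# Example: five self-sampled answers to the same question
print(label_knowledge(["Paris", "Paris", "Paris", "Paris", "Lyon"], "Paris"))
```

Labels produced this way could then supply targets for the first-stage supervised fine-tuning and preference pairs for the second-stage direct preference optimization.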
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Large Language Models, Knowledge Introspection
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1491