Knowledge Introspection: A Self-reflection Method for Reliable and Helpful Large Language Models

ACL ARR 2024 June Submission1491 Authors

14 Jun 2024 (modified: 02 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) have inherent knowledge deficiencies caused by insufficient or erroneous training data and incomplete training strategies. Furthermore, LLMs are often overconfident and unaware of these deficiencies, which poses safety and legal risks to users. Inspired by the process of human introspection, we propose a two-stage method that equips LLMs with the capability of knowledge introspection. Our method relies only on data generated by the LLM itself and teaches the model to distinguish among what it knows, what it is uncertain about, and what it does not know. Training proceeds in two stages: supervised fine-tuning in the first stage and direct preference optimization in the second. Experimental results demonstrate that our method effectively enhances the LLM's understanding of its internal knowledge and significantly improves the accuracy, reliability, and helpfulness of the model's responses.
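The abstract does not specify how self-generated data yields the known / uncertain / unknown distinction; one common self-generated signal is answer consistency across repeated samples of the same question. The sketch below is a hypothetical illustration of that idea only — the function name, thresholds, and labeling rule are assumptions, not the paper's actual method:

```python
from collections import Counter

def introspection_label(sampled_answers, known_thresh=0.8, unknown_thresh=0.4):
    """Label a question 'known', 'uncertain', or 'unknown' from the
    agreement rate of answers the model itself sampled for that question.
    (Hypothetical sketch; thresholds are illustrative assumptions.)"""
    counts = Counter(sampled_answers)
    _, top_count = counts.most_common(1)[0]
    # fraction of samples agreeing with the modal answer
    agreement = top_count / len(sampled_answers)
    if agreement >= known_thresh:
        return "known"       # model answers consistently -> treat as known
    if agreement >= unknown_thresh:
        return "uncertain"   # partial agreement -> uncertain
    return "unknown"         # answers scatter -> likely unknown

# Example: five self-sampled answers to the same question
print(introspection_label(["Paris", "Paris", "Paris", "Paris", "Lyon"]))      # -> known
print(introspection_label(["Paris", "Lyon", "Rome", "Berlin", "Madrid"]))     # -> unknown
```

Labels produced this way could then serve as supervision for the first (SFT) stage and as preference pairs for the second (DPO) stage.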
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Large Language Models, Knowledge Introspection
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1491