Abstract: The recent success of Large Language Models (LLMs) has catalyzed increasing interest in their self-correction capabilities. This paper presents a comprehensive investigation into the intrinsic self-correction of LLMs, aiming to address the ongoing debate about its feasibility. Our research identifies an important latent factor in the self-correction process: the "confidence" of LLMs. Overlooking this factor can cause models to over-criticize themselves, leading to unreliable conclusions about the efficacy of self-correction in LLMs. To address this over-criticism issue, we introduce an If-or-Else (IoE) prompting principle, which guides LLMs to judge themselves based on their intrinsic "confidence", enabling effective self-correction without relying on external feedback or human-annotated examples.
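To make the IoE idea concrete, below is a minimal sketch of a confidence-gated self-correction loop as the abstract describes it: keep the answer if the model reports confidence, otherwise revise. The helper `call_llm` and the prompt wording are illustrative assumptions, not the paper's exact templates.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around any LLM chat-completion API."""
    raise NotImplementedError


def ioe_self_correct(question: str) -> str:
    # Step 1: obtain an initial answer.
    answer = call_llm(f"Question: {question}\nAnswer:")

    # Step 2: If-or-Else check - ask the model whether it is confident
    # in its own answer instead of forcing it to find a flaw.
    verdict = call_llm(
        f"Question: {question}\nYour answer: {answer}\n"
        "If you are confident in your answer, reply 'confident'; "
        "else, briefly explain what is wrong."
    )

    # Step 3: keep the answer when confident, otherwise request a revision.
    if "confident" in verdict.lower():
        return answer
    return call_llm(
        f"Question: {question}\nPrevious answer: {answer}\n"
        f"Identified issue: {verdict}\nProvide a corrected answer:"
    )
```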
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: NLP engineering experiment
Languages Studied: English