Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models

Anonymous

16 Feb 2024 (modified: 02 Dec 2024) | ACL ARR 2024 February Blind Submission | Readers: Everyone
Abstract: The recent success of Large Language Models (LLMs) has catalyzed increasing interest in their self-correction capabilities. This paper presents a comprehensive investigation into the intrinsic self-correction of LLMs, aiming to address the ongoing debate about its feasibility. Our research identifies an important latent factor in the self-correction process: the "confidence" of LLMs. Overlooking this factor can cause models to over-criticize themselves, leading to unreliable conclusions about the efficacy of self-correction in LLMs. To address this over-criticism issue, we introduce an If-or-Else (IoE) prompting principle, which guides LLMs to judge their own intrinsic "confidence", enabling effective self-correction without relying on external feedback or human-annotated examples.
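To make the IoE idea concrete, below is a minimal sketch of a confidence-gated self-correction loop as described in the abstract. The prompt wording and the `query_llm` helper are illustrative assumptions, not the paper's exact prompts or implementation; the key point is that the revision step is conditioned on the model's own stated confidence rather than unconditionally demanding a correction.

```python
# Minimal sketch of an If-or-Else (IoE) self-correction loop.
# Assumptions: `query_llm` is a placeholder for any chat-completion call,
# and the prompt text below is illustrative, not the paper's exact wording.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM backend of your choice."""
    raise NotImplementedError("Wire this to your LLM API.")

def ioe_self_correct(question: str) -> str:
    # Step 1: obtain an initial answer from the model.
    answer = query_llm(f"Question: {question}\nAnswer:")

    # Step 2: IoE check. Instead of always asking the model to revise
    # (which can trigger over-criticism), ask it to keep the answer if
    # it is confident, and to revise only otherwise.
    ioe_prompt = (
        f"Question: {question}\n"
        f"Your answer: {answer}\n"
        "If you are confident in your answer, keep it unchanged. "
        "Else, identify the problem and give a corrected answer."
    )
    return query_llm(ioe_prompt)
```

The design choice this sketch illustrates: the conditional ("If ... Else ...") phrasing leaves the keep-or-revise decision to the model's intrinsic confidence, avoiding the unreliable behavior that unconditional "find the mistake" prompts can induce.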
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: NLP engineering experiment
Languages Studied: English
