Keywords: Self-Correction, Meta-Feedback, Iterative Refinement, Feedback-on-Feedback (FoF), Natural Language Processing (NLP), Machine Learning, Zero-Shot Learning, Self-Refine, Model Performance Enhancement, Feedback Quality, GSM8K Dataset, MBPP Dataset, CSMT Dataset
TL;DR: Improving the self-correction capabilities of language models by leveraging meta-feedback to enhance feedback quality and overall performance.
Abstract: Large language models (LLMs) are capable of self-correcting their responses by generating feedback and refining the initial output. However, their performance may sometimes decline after self-correction, either because the feedback contains errors or because the model unnecessarily attempts to refine an already accurate response. To address these limitations, we investigate whether the same LLM can generate meta-feedback that pinpoints errors in the feedback rather than in the response, an ability that remains under-explored despite extensive research on LLMs' self-feedback generation. We design a novel self-correction prompting framework, Feedback-on-Feedback (FoF), which leverages meta-feedback to improve the feedback before refining the response. Our framework first samples multiple pieces of feedback for the initial response and prompts the LLM to generate meta-feedback that analyzes the inconsistencies among these feedback pieces. Based on the meta-feedback, the LLM generates refined feedback that subsequently guides the revision of the response. Our FoF framework consistently outperforms competitive baselines across two LLMs on three datasets covering arithmetic reasoning, machine translation, and programming tasks. Specifically, FoF improves performance on GSM8K by 3.6 points (45.2% vs. 41.6% for the initial answer) and on MBPP by 6.4 points (51.7% vs. 45.3%) using the LLaMA-3-8B model.
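To make the pipeline described in the abstract concrete, below is a minimal sketch of the FoF loop. It assumes a hypothetical single-turn completion function `call_llm`; the prompt wording, the helper name, and the default number of sampled feedback pieces are illustrative assumptions rather than the paper's actual prompts.

```python
from typing import Callable, List


def feedback_on_feedback(
    call_llm: Callable[[str], str],
    question: str,
    initial_answer: str,
    num_feedbacks: int = 2,  # assumption: how many feedback pieces to sample
) -> str:
    """Sketch of the Feedback-on-Feedback (FoF) prompting loop."""
    # Step 1: sample multiple pieces of feedback on the initial answer.
    feedbacks: List[str] = [
        call_llm(
            f"Question: {question}\nAnswer: {initial_answer}\n"
            "Give feedback on any errors in this answer."
        )
        for _ in range(num_feedbacks)
    ]

    # Step 2: ask the same LLM for meta-feedback that analyzes
    # inconsistencies among the sampled feedback pieces.
    joined = "\n".join(f"Feedback {i + 1}: {fb}" for i, fb in enumerate(feedbacks))
    meta_feedback = call_llm(
        f"Question: {question}\nAnswer: {initial_answer}\n{joined}\n"
        "These feedback pieces may disagree. Point out inconsistencies and "
        "identify which criticisms are actually correct."
    )

    # Step 3: produce refined feedback guided by the meta-feedback.
    refined_feedback = call_llm(
        f"Question: {question}\nAnswer: {initial_answer}\n"
        f"Meta-feedback: {meta_feedback}\n"
        "Write a single corrected piece of feedback for the answer."
    )

    # Step 4: revise the answer using the refined feedback. If the feedback
    # finds no real error, the prompt asks the model to keep the original
    # answer, avoiding harmful over-correction of correct responses.
    revised_answer = call_llm(
        f"Question: {question}\nAnswer: {initial_answer}\n"
        f"Feedback: {refined_feedback}\n"
        "Revise the answer if the feedback identifies a real error; "
        "otherwise, return the original answer unchanged."
    )
    return revised_answer
```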
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12748