Keywords: Large Language Models, Critique Model, Synthetic Data, Mathematical Reasoning, Scalable Oversight
TL;DR: SCRIT enables large language models to evolve their critique capabilities without human oversight by learning from self-generated data.
Abstract: Despite their remarkable performance, Large Language Models (LLMs) face a critical challenge: providing feedback for tasks where human evaluation is difficult or where LLMs potentially outperform humans. In such scenarios, leveraging the *critique* ability of LLMs themselves—identifying and correcting flaws—shows considerable promise. This paper explores enhancing the critique abilities of LLMs, noting that current approaches rely on human annotations or more powerful models, leaving the challenge of improving critique abilities *without* external supervision *unresolved*. We introduce SCRIT (Self-evolving CRITic), a framework that trains LLMs with self-generated data to evolve their critique abilities. We find that naive data generation approaches often produce superficial critiques of low quality. To address this limitation, we propose a contrastive-critic approach that uses reference solutions to strengthen LLMs' understanding of the relevant concepts, and we incorporate a self-validation scheme to further improve data quality. Implemented with Qwen2.5-72B-Instruct, a leading LLM, SCRIT demonstrates consistent improvements: a 10.0\% relative gain in critique-correction accuracy and a 19.0\% relative improvement in error identification F1-score across various benchmarks. Our analysis reveals that SCRIT's performance scales positively with data and model size and enables continuous improvement through multi-round iterations.
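To make the pipeline described in the abstract concrete, here is a minimal sketch of one round of SCRIT-style critique data generation: contrastive critique against a reference solution, followed by self-validation before the example is kept for training. The `llm.generate` client, prompt wording, and helper names are our own illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the authors' code): one round of SCRIT-style
# self-generated critique data collection, assuming a generic chat-model
# client with llm.generate(prompt) -> str and hypothetical prompt templates.

from dataclasses import dataclass


@dataclass
class CritiqueExample:
    problem: str
    student_solution: str
    critique: str  # identifies the flaw and proposes a correction


def contrastive_critique(llm, problem: str, reference_solution: str,
                         student_solution: str) -> str:
    """Ask the model to critique a (possibly flawed) solution while it can
    consult a reference solution, so the critique is grounded in the
    relevant concepts rather than being superficial."""
    prompt = (
        f"Problem:\n{problem}\n\n"
        f"Reference solution (do not reveal):\n{reference_solution}\n\n"
        f"Student solution to critique:\n{student_solution}\n\n"
        "Identify any errors step by step, explain why they are wrong, "
        "and give a corrected solution."
    )
    return llm.generate(prompt)


def self_validate(llm, problem: str, critique: str, final_answer: str) -> bool:
    """Keep a critique only if the correction it proposes reaches the known
    final answer -- a simple self-validation filter for data quality."""
    prompt = (
        f"Problem:\n{problem}\n\nCritique and correction:\n{critique}\n\n"
        f"Does the corrected solution arrive at the answer {final_answer}? "
        "Reply YES or NO."
    )
    return llm.generate(prompt).strip().upper().startswith("YES")


def build_critique_dataset(llm, items) -> list[CritiqueExample]:
    """items: iterable of (problem, reference_solution, student_solution,
    final_answer) tuples; returns validated critique examples for fine-tuning."""
    dataset = []
    for problem, reference, student, answer in items:
        critique = contrastive_critique(llm, problem, reference, student)
        if self_validate(llm, problem, critique, answer):
            dataset.append(CritiqueExample(problem, student, critique))
    return dataset
```

The validated examples would then be used to fine-tune the same model, and the cycle can be repeated across rounds, which is the "self-evolving" aspect the abstract highlights.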
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 437