Keywords: self-training, post-training, language model
TL;DR: We analyze self-improvement with self-generated data.
Abstract: Post-training of language models often depends on costly external signals such as human annotations or domain-specific rewards. As an alternative, we explore model self-evolution through the lens of simple generator–verifier games. A single base model plays both roles---generating candidate solutions and verifying/improving their quality---to construct preference data for fine-tuning. To extract reliable signals from noisy self-verification, we leverage _thresholded majority voting_, which approximates high-precision preference pairs. The approach enables self-evolution on synthetic logical reasoning and realistic mathematical reasoning tasks, even when models initially perform poorly. For example, on the Knights and Knaves benchmark, accuracy rises from 31.0% to **40.7%** with single-turn verification, **42.2%** with multi-turn verification, **44.1%** with iterative training, and **44.8%** with curriculum learning. Notably, models trained only on easier instances generalize effectively to harder test data, demonstrating _emergent easy-to-hard generalization_. These results show that simple generator–verifier games can unexpectedly enhance reasoning in small models, offering a new perspective on concurrent research in self-improvement and RL with verifiable rewards.
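To make the thresholded majority-voting step concrete, here is a minimal sketch of how noisy self-verification votes could be turned into high-precision preference pairs. The vote count, the `hi`/`lo` thresholds, and the `verify_vote` stub are illustrative assumptions, not the paper's exact procedure.

```python
import random
from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def verify_vote(prompt: str, candidate: str) -> bool:
    """One noisy self-verification vote.

    Stub: in practice this would prompt the same base model to judge
    whether `candidate` correctly solves `prompt`.
    """
    return random.random() < 0.5  # placeholder for a model call


def majority_score(prompt: str, candidate: str, n_votes: int = 8) -> float:
    """Fraction of positive votes over repeated self-verifications."""
    votes = [verify_vote(prompt, candidate) for _ in range(n_votes)]
    return sum(votes) / n_votes


def build_preference_pairs(prompt: str, candidates: list[str],
                           hi: float = 0.75, lo: float = 0.25) -> list[PreferencePair]:
    """Keep only candidates whose vote scores clear opposite thresholds,
    approximating high-precision chosen/rejected labels for fine-tuning."""
    scored = [(c, majority_score(prompt, c)) for c in candidates]
    chosen = [c for c, s in scored if s >= hi]
    rejected = [c for c, s in scored if s <= lo]
    return [PreferencePair(prompt, c, r) for c in chosen for r in rejected]
```

Candidates with ambiguous vote scores (between `lo` and `hi`) are simply discarded, trading data quantity for label precision.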
Primary Area: foundation or frontier models, including LLMs
Submission Number: 23592