I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

ACL ARR 2024 December Submission1035 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) have achieved significant advancements; however, the common learning paradigm treats LLMs as passive information repositories, neglecting their potential for active learning and alignment. Some approaches train LLMs on their own generated synthetic data, exploring the possibility of active alignment, yet a large gap remains between these one-time alignment methods and the continuous, automatic alignment exhibited by humans. In this paper, we introduce **I-SHEEP**, an **I**terative **S**elf-En**H**anc**E**m**E**nt **P**aradigm. This human-like paradigm enables LLMs to **iteratively self-improve even in low-resource scenarios**. Compared to the one-time alignment method Dromedary \cite{sun2023principledriven}, which corresponds to the first iteration in this paper, I-SHEEP significantly enhances the capabilities of both Qwen and Llama models. On the Qwen-1.5 72B model, I-SHEEP achieves a maximum relative improvement of 78.2\% on Alpaca Eval, 24.0\% on MT Bench, and an absolute gain of 8.88\% in IFEval accuracy over subsequent iterations. I-SHEEP also surpasses the base model on various standard generation benchmarks, with average improvements of 24.77\% on code generation tasks, 12.04\% on TriviaQA, and 20.29\% on SQuAD. We further provide new insights based on the experimental results. Our code, datasets, and models are available at https://anonymous.4open.science/r/SHEEP/.
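
To make the iterative paradigm described in the abstract concrete, the following is a minimal, runnable sketch of a self-enhancement loop in the spirit of I-SHEEP: the current model generates instruction-response pairs, self-assesses their quality, keeps only the high-quality subset, and is fine-tuned on it before the next iteration. All function names, the scoring threshold, and the dummy scoring logic are hypothetical placeholders for illustration, not the authors' actual pipeline or API.

```python
"""Minimal sketch of an iterative self-enhancement loop (I-SHEEP-style).

Placeholders (generate_pairs, assess_quality, fine_tune, threshold) are
assumptions for illustration; they are not the paper's implementation.
"""

from dataclasses import dataclass
from typing import List


@dataclass
class Pair:
    instruction: str
    response: str
    score: float = 0.0  # self-assessed quality score


def generate_pairs(model_name: str, n: int) -> List[Pair]:
    # Placeholder: in practice, prompt the current model to produce
    # new instruction-response pairs from a small seed set.
    return [Pair(f"instruction {i}", f"response {i}") for i in range(n)]


def assess_quality(model_name: str, pairs: List[Pair]) -> List[Pair]:
    # Placeholder: in practice, ask the same model to rate each pair;
    # here a dummy score keeps the sketch runnable end to end.
    for i, p in enumerate(pairs):
        p.score = (i % 10) / 10.0
    return pairs


def fine_tune(model_name: str, data: List[Pair]) -> str:
    # Placeholder: in practice, run supervised fine-tuning on the
    # filtered pairs and return the new model identifier.
    return f"{model_name}+sft"


def i_sheep_loop(base_model: str, iterations: int = 3,
                 per_iter: int = 100, threshold: float = 0.5) -> str:
    """Iteratively: generate -> self-assess -> filter -> fine-tune."""
    model = base_model
    for t in range(iterations):
        pairs = generate_pairs(model, per_iter)
        scored = assess_quality(model, pairs)
        kept = [p for p in scored if p.score >= threshold]
        model = fine_tune(model, kept)
        print(f"iteration {t + 1}: kept {len(kept)}/{len(pairs)} pairs -> {model}")
    return model


if __name__ == "__main__":
    i_sheep_loop("qwen-1.5-72b")
```

Each iteration feeds the fine-tuned model back in as the generator and assessor, which is what distinguishes this loop from one-time alignment: the first iteration alone corresponds to a Dromedary-style setup, while later iterations reuse the improved model.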
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: data-efficient training
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: English
Submission Number: 1035
