Keywords: Large Language Model, Instruction Tuning, Data Efficient Training
Abstract: Pre-trained large language models (LLMs) typically undergo instruction fine-tuning to improve alignment. Recent research highlights that the quality and diversity of instruction data matter more than sheer quantity, motivating the selection of diverse, high-quality instruction subsets to reduce training costs. However, how these selected subsets should evolve as new instruction data emerge remains insufficiently explored. To support the ongoing alignment of LLMs, we introduce Instruction Bank (InsBank), a continuously updated repository that integrates the latest valuable instruction data. We further propose Progressive Instruction Bank Evolution (PIBE), a novel framework designed to evolve InsBank effectively and efficiently over time. PIBE first employs a gradual data selection strategy to maintain long-term efficiency, using a representation-based diversity score that captures relationships between data points and retains historical information for comprehensive diversity evaluation. This score also allows diversity and quality to be combined flexibly during data selection and ranking. Extensive experiments demonstrate that PIBE significantly outperforms baseline methods in evolving InsBank. Additionally, PIBE enables users to flexibly extract smaller subsets to fit their specific training budget.
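To make the abstract's core idea concrete, below is a minimal sketch of selecting an instruction subset by flexibly combining a quality score with a representation-based diversity score. This is an illustrative assumption of how such a combination could work (greedy selection with cosine-distance-based diversity and a mixing weight alpha), not the paper's actual PIBE algorithm; all function and variable names here are hypothetical.

```python
import numpy as np

def select_subset(embeddings, quality, budget, alpha=0.5):
    """Greedily pick `budget` points maximizing a combined score:
    alpha * quality + (1 - alpha) * diversity, where diversity is
    1 minus the cosine similarity to the nearest already-selected point.
    Illustrative sketch only; not the paper's PIBE method."""
    n = embeddings.shape[0]
    # Normalize so cosine similarity reduces to a dot product.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    selected = []
    # Running max similarity of each candidate to the selected set;
    # -1 marks "no point selected yet" (maximal diversity for everyone).
    max_sim = np.full(n, -1.0)
    for _ in range(budget):
        diversity = np.where(max_sim < 0, 1.0, 1.0 - max_sim)
        combined = alpha * quality + (1.0 - alpha) * diversity
        if selected:
            combined[selected] = -np.inf  # never pick the same point twice
        pick = int(np.argmax(combined))
        selected.append(pick)
        # Update each candidate's similarity to its nearest selected point.
        max_sim = np.maximum(max_sim, emb @ emb[pick])
    return selected

# Toy usage: 100 random 16-d instruction embeddings with random quality scores.
rng = np.random.default_rng(0)
subset = select_subset(rng.normal(size=(100, 16)), rng.uniform(size=100), budget=10)
print(subset)
```

Because alpha enters only at scoring time, a user could re-run the ranking with a different alpha or a smaller budget to extract smaller subsets, matching the flexible-extraction property the abstract describes.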
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1801