AlpaGasus: Training A Better Alpaca with Fewer Data

31 Jul 2023 (modified: 31 Jul 2023) · OpenReview Archive Direct Upload
Abstract: Large language models (LLMs) obtain instruction-following capability through instruction fine-tuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., ALPACA's 52k dataset) surprisingly contain many low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and removes low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce ALPAGASUS, which is fine-tuned on only 9k high-quality examples filtered from the 52k ALPACA data. ALPAGASUS significantly outperforms the original ALPACA, as evaluated by GPT-4 on multiple test sets, and its 13B variant matches more than 90% of the performance of its teacher LLM (i.e., text-davinci-003) on the test tasks. It also trains 5.7x faster, reducing the training time of the 7B variant from 80 minutes (for ALPACA) to 14 minutes. Overall, ALPAGASUS demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models.
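To make the data-selection idea concrete, below is a minimal sketch of LLM-based quality filtering over an Alpaca-style dataset. The rating prompt wording, the 0-5 scoring scale, the 4.5 keep threshold, and the use of gpt-3.5-turbo as a stand-in for "ChatGPT" are illustrative assumptions; the abstract does not specify these details.

```python
# Sketch: filter an Alpaca-style instruction-tuning dataset with an LLM rater.
# Assumptions (not given in the abstract): prompt wording, 0-5 scale, threshold.
import json
import re

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RATING_PROMPT = (
    "Rate the quality of the response to the instruction on a scale of 0 to 5, "
    "where 5 means accurate and relevant and 0 means incorrect or irrelevant. "
    "Reply with only the number.\n\n"
    "Instruction: {instruction}\nInput: {input}\nResponse: {response}"
)


def rate_example(example: dict) -> float:
    """Ask the rater LLM for a quality score of one instruction/response pair."""
    prompt = RATING_PROMPT.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        response=example.get("output", ""),
    )
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    text = reply.choices[0].message.content or ""
    match = re.search(r"\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0


def filter_dataset(path: str, keep_threshold: float = 4.5) -> list:
    """Keep only examples whose rated score meets the threshold."""
    with open(path) as f:
        data = json.load(f)  # list of {"instruction", "input", "output"} records
    return [ex for ex in data if rate_example(ex) >= keep_threshold]


if __name__ == "__main__":
    kept = filter_dataset("alpaca_data.json")
    print(f"Kept {len(kept)} high-quality examples")
```

The filtered subset would then be used for standard instruction fine-tuning in place of the full 52k set, which is what yields the reported speedup.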