AlpaGasus: Training A Better Alpaca with Fewer Data

31 Jul 2023 (modified: 31 Jul 2023) · OpenReview Archive Direct Upload
Abstract: Large language models (LLMs) obtain instruction-following capability through instruction fine-tuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., ALPACA's 52k dataset) surprisingly contain many low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and removes low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce ALPAGASUS, which is fine-tuned on only 9k high-quality examples filtered from the 52k ALPACA data. ALPAGASUS significantly outperforms the original ALPACA, as evaluated by GPT-4 on multiple test sets, and its 13B variant matches more than 90% of the performance of its teacher LLM (i.e., text-davinci-003) on the test tasks. It also trains 5.7x faster, reducing the training time of the 7B variant from 80 minutes (for ALPACA) to 14 minutes. Overall, ALPAGASUS demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models.
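To make the data-selection idea concrete, below is a minimal sketch of LLM-based quality filtering over an Alpaca-style dataset. The rating prompt wording, the 0-5 scoring scale, the 4.5 keep threshold, and the use of gpt-3.5-turbo as a stand-in for "ChatGPT" are illustrative assumptions; the abstract does not specify these details.

```python
# Sketch: filter an Alpaca-style instruction-tuning dataset with an LLM rater.
# Assumptions (not given in the abstract): prompt wording, 0-5 scale, threshold.
import json
import re

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RATING_PROMPT = (
    "Rate the quality of the response to the instruction on a scale of 0 to 5, "
    "where 5 means accurate and relevant and 0 means incorrect or irrelevant. "
    "Reply with only the number.\n\n"
    "Instruction: {instruction}\nInput: {input}\nResponse: {response}"
)


def rate_example(example: dict) -> float:
    """Ask the rater LLM for a quality score of one instruction/response pair."""
    prompt = RATING_PROMPT.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        response=example.get("output", ""),
    )
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    text = reply.choices[0].message.content or ""
    match = re.search(r"\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0


def filter_dataset(path: str, keep_threshold: float = 4.5) -> list:
    """Keep only examples whose rated score meets the threshold."""
    with open(path) as f:
        data = json.load(f)  # list of {"instruction", "input", "output"} records
    return [ex for ex in data if rate_example(ex) >= keep_threshold]


if __name__ == "__main__":
    kept = filter_dataset("alpaca_data.json")
    print(f"Kept {len(kept)} high-quality examples")
```

The filtered subset would then be used for standard instruction fine-tuning in place of the full 52k set, which is what yields the reported speedup.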