Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following instances covering 54 tasks. These tasks span five core scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF is unique as the only entirely expert-written, high-quality instruction-following dataset designed for extracting and synthesizing information from research literature across diverse scientific fields. It features complex instructions with long input contexts, detailed task descriptions, and structured outputs. To demonstrate its utility, we finetune a series of large language models (LLMs) using a mix of general-domain and SciRIFF instructions. On nine out-of-distribution held-out tasks (referred to as SciRIFF-Eval), LLMs finetuned on SciRIFF achieve a 70.6% average improvement over baselines trained only on general-domain instructions. SciRIFF facilitates the development and evaluation of LLMs that help researchers navigate the rapidly growing body of scientific literature.
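As a rough illustration of the finetuning setup the abstract describes, the sketch below mixes SciRIFF instances with a general-domain instruction set before training. This is a minimal sketch, not the authors' pipeline: the Hugging Face dataset ID "allenai/SciRIFF", the stand-in general-domain set, and the field names are all assumptions, not details taken from the submission.

```python
# Minimal sketch of mixing SciRIFF with general-domain instructions
# for finetuning. Dataset IDs, configs, and field names are assumptions.
import random
from datasets import load_dataset

# Assumed dataset ID/fields; adjust to the released configuration.
sciriff = load_dataset("allenai/SciRIFF", split="train")
# Stand-in general-domain instruction set, chosen for illustration only.
general = load_dataset("tatsu-lab/alpaca", split="train")

def to_pair(prompt, response):
    """Normalize an example to a simple (prompt, response) record."""
    return {"prompt": prompt, "response": response}

# SciRIFF examples are assumed to carry "input"/"output" fields;
# Alpaca examples carry "instruction"/"output" (its optional "input"
# field is ignored here for simplicity).
mix = [to_pair(ex["input"], ex["output"]) for ex in sciriff]
mix += [to_pair(ex["instruction"], ex["output"]) for ex in general]

random.shuffle(mix)  # interleave the two domains before finetuning
```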
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: scholarly document processing, instruction-following, large language models
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 2300